Abstract. We prove asymptotic equipartition properties for simple hierarchical structures
(modelled as multitype Galton-Watson trees) and networked structures (modelled as randomly
coloured random graphs). For example, for large $n$, a networked data structure consisting of
$n$ units connected by an average number of links of order $n/\log n$ can be coded by about
$H \times n$ bits, where $H$ is an explicitly defined entropy. The main tools in our proofs
are large deviation principles for suitably defined empirical measures.



1. Introduction 

Information is often structured in a nonlinear way. For example, in genetics information often has an
implicit hierarchical structure, and in computer science data is often organized in the form of a network.
To transmit or compress data from these sources, one needs efficient coding schemes and approximate
pattern matching algorithms. The Shannon-McMillan-Breiman theorem, or asymptotic equipartition
property (AEP), plays a key role in this regard, for example by providing bounds on the possible
performance of algorithms.

Research on the AEP (and its applications) within mathematics and information theory has so far
followed two major lines. The first has focussed on stationary ergodic processes
such as Markov chains, see Cover and Thomas [5] and the references therein. The second has dealt
with stationary (ergodic) random fields on $\mathbb{Z}^d$, as well as amenable group actions, see, for example,
Dembo and Kontoyiannis [9] and the references therein. Whilst typical applications of the
former have concentrated on data from linear sources, the latter includes recent advances such as image
and video processing, geostatistics, and statistical mechanics.

However, numerous types of data we usually come across in applications (communication studies, 
demographic studies, biological population studies and the field of physics) are naturally structured 
like networks or trees. For example, the WWW (consisting of a collection of pages residing on a server 
with a given name, together with 'hyperlinks' with their direction ignored), data on the spread of 
some disease in a given population and many more, can be described by networks. Equally, the age 
structure of a given population is best modelled by genealogical trees. 

In this paper we use the large deviation techniques provided in the recent paper of Dembo, Mörters
and Sheffield [8] to study the AEP of structured data consisting of a large number of units, chosen
from a finite set, together with a number of links connecting the units.

As an application of our abstract principles, we consider the following concrete examples from biology.

• Metabolic network: This is a graph of interactions forming a part of the energy generation
and biosynthesis metabolism of the bacterium E. coli. Here, the units represent substrates and
products, and links represent interactions. See Newman [13].

• Mutation study: Consider mutations in mitochondrial DNA (mtDNA for short), e.g. the
mtDNA$^{4977}$ deletion (a mutation which causes a deletion of about one third of the
mitochondrial genome). The replication of mtDNA can be described by a tree, where the units are
a (normal) and b (mutant) and the links indicate 'mother-child' relations. See Olofsson and
Shaw [14] and the references therein.

The core results of the paper are the Shannon-McMillan-Breiman theorems for two simple probabilistic
models: multitype Galton-Watson trees describing hierarchical data structures, and a class of
randomly coloured random graphs describing networked data structures, see Theorems 2.1 and 2.2.

Specifically, we consider for the first model typed trees described by the following procedure: The
root carries a random type chosen according to some law on a finite alphabet; given the type of
a vertex, the number and types of the children (ordered from left to right) are given, independently
of everything else, by an offspring law. For the second we look at random graph models constructed
as follows: Assign vertices colours independently and identically according to some colour law on a
finite set of colours; connect any pair of vertices independently with a probability depending
on their colours. This model, with the simple Erdős-Rényi graph with independent colours as a special
case, was introduced by Penman in his thesis [15]; see Cannings and Penman for an exposition.

We also present large deviation principles (LDPs) for the empirical colour measure and the empirical pair
measure of sub- and supercritical coloured random graphs. The major tool used in the proofs of these
LDPs is an (exponential) change of measure. We remark here that some of our results fit well into the
framework of large deviations for mixtures, which is utilized in the proofs of the LDPs in Doku and
Mörters [7].

1.1 A model for simple hierarchical structures. In this subsection we review the model for
simple hierarchical data structures, multitype Galton-Watson trees. To begin, we collect some notation
and concepts from Dembo et al. [8]. By $\mathcal{T}$ we denote the set of all finite rooted planar trees
$T$, by $V = V(T)$ the set of all vertices and by $E = E(T)$ the set of all edges oriented away from the
root, which is always denoted by $\rho$. We write $|T|$ for the number of vertices in the tree $T$. Let
$\mathcal{X}$ be a finite alphabet and write

$$\mathcal{X}^* = \bigcup_{n=0}^{\infty} \{n\} \times \mathcal{X}^n.$$

We equip $\mathcal{X}^*$ with the discrete topology, i.e. the topology in which all subsets are open. We
observe that the offspring of any vertex $v \in T$ is characterized by an element of $\mathcal{X}^*$ and that
there is an element $(0, \emptyset)$ in $\mathcal{X}^*$ symbolizing absence of offspring.

Let $\mu$ be a probability measure (initial distribution) on $\mathcal{X}$ and
$Q : \mathcal{X} \times \mathcal{X}^* \to [0, 1]$ be an offspring
transition kernel. The law $\mathbb{P}$ of a tree-indexed process $X$ is defined by the following procedure:

• The root $\rho$ carries a random type $X(\rho)$ chosen according to the probability measure $\mu$ on $\mathcal{X}$.

• For every vertex with type $a \in \mathcal{X}$ the offspring number and types are given, independently of
everything else, by the offspring law $Q\{\cdot \mid a\}$ on $\mathcal{X}^*$. We write

$$Q\{\cdot \mid a\} = \mathbb{Q}\{(N, X_1, \ldots, X_N) \in \cdot \mid a\},$$

i.e. we have a random number $N$ of descendants with types $X_1, \ldots, X_N$.

We shall consider $X = ((X(v), C(v)) : v \in V)$ under the joint law of tree and offspring. We interpret
$X$ as a multitype Galton-Watson tree and $X(v)$ as the type of vertex $v$. For each typed tree $X$ and each
vertex $v$ we denote by $C(v) = (N(v), X_1(v), \ldots, X_{N(v)}(v)) \in \mathcal{X}^*$ the number and types of the children
of $v$, ordered from left to right. We notice that the children of the root (denoted by $\rho$) are ordered
but the root itself is not. We call an offspring distribution $Q$ bounded if for some $N_0 < \infty$ we have

$$Q\{N > N_0 \mid a\} = 0, \quad \text{for all } a \in \mathcal{X}.$$

Denote, for every $c = (n(c), a_1(c), \ldots, a_{n(c)}(c)) \in \mathcal{X}^*$ and $a \in \mathcal{X}$, the multiplicity of the symbol $a$ in $c$ by

$$m(a,c) = \sum_{i=1}^{n(c)} 1_{\{a_i(c) = a\}}.$$

Define the matrix $A$ with index set $\mathcal{X} \times \mathcal{X}$ and nonnegative entries by

$$A(a,b) = \sum_{c \in \mathcal{X}^*} Q\{c \mid b\}\, m(a,c), \quad \text{for } a, b \in \mathcal{X}.$$

$A(a,b)$ is the expected number of offspring of type $a$ of a vertex of type $b$. Let
$A^*(a,b) = \sum_{k=1}^{\infty} A^k(a,b) \in [0, \infty]$. We say that the matrix $A$ is irreducible if
$A^*(a,b) > 0$ for all $a, b \in \mathcal{X}$.

The multitype Galton-Watson tree is called irreducible if the matrix $A$ is irreducible. It is called
critical (subcritical, supercritical) if the largest eigenvalue of the matrix $A$ is 1 (less than 1, greater than 1,
resp.). Let $\pi$ be the eigenvector corresponding to the largest eigenvalue of the matrix $A$ (normalized
to a probability vector). Then $\pi$ is unique if the Galton-Watson tree is irreducible.
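
The definitions of $A$, irreducibility and criticality can be illustrated with a short sketch. The Python code below is an illustration only, under stated assumptions: the offspring kernel is stored as a dictionary `Q[b]` mapping tuples of child types to probabilities (a hypothetical encoding, not taken from the paper), the eigenvector is taken to be a right eigenvector of $A$, and the two-type example kernel is invented. The largest eigenvalue is approximated by power iteration on $A + I$, which is meant for irreducible $A$.

```python
from collections import Counter


def mean_matrix(types, Q):
    """Return A with A[a][b] = expected number of type-a children of a type-b vertex."""
    A = {a: {b: 0.0 for b in types} for a in types}
    for b in types:
        for children, prob in Q[b].items():
            # Counter(children)[a] is the multiplicity m(a, c) of the symbol a in c
            for a, mult in Counter(children).items():
                A[a][b] += prob * mult
    return A


def is_irreducible(types, A):
    """True if A*(a, b) > 0 for all a, b: every type has descendants of every type."""
    for b in types:
        reached, frontier = set(), {b}
        while frontier:
            children = {a for a in types for t in frontier if A[a][t] > 0}
            frontier = children - reached
            reached |= children
        if reached != set(types):
            return False
    return True


def perron(types, A, iters=500):
    """Power iteration on A + I (assumes A irreducible).

    Returns an approximation of the largest eigenvalue of A and of the
    corresponding right eigenvector, normalized to a probability vector.
    """
    v = {a: 1.0 / len(types) for a in types}
    rho = 0.0
    for _ in range(iters):
        w = {a: v[a] + sum(A[a][b] * v[b] for b in types) for a in types}
        total = sum(w.values())
        rho = total - 1.0  # v sums to 1, so the sum of (A + I)v tends to rho + 1
        v = {a: w[a] / total for a in types}
    return rho, v


# Hypothetical two-type kernel: a vertex has no children, or one child of each type.
types = ["a", "b"]
Q = {"a": {(): 0.5, ("a", "b"): 0.5}, "b": {(): 0.5, ("a", "b"): 0.5}}
A = mean_matrix(types, Q)
rho, pi = perron(types, A)
```

For this invented kernel all entries of $A$ equal 1/2, so the tree is irreducible and critical, and the normalized eigenvector is uniform.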

1.2 A model for simple networked structures. In this subsection, we review the model for
simple networked structures, the randomly coloured random graph model. We begin by fixing the
following notation. Let $\mathcal{X}$ be a finite alphabet or colour set. Let $V$ be a fixed set of $n$ vertices, say
$V = \{1, \ldots, n\}$. Denote by $\mathcal{G}$ the set of all (simple) graphs and by $\mathcal{G}_n$ the set of all (simple) graphs
with vertex set $V = \{1, \ldots, n\}$ and edge set

$$E \subset \mathcal{E} := \{(u,v) \in V \times V : u < v\},$$

where the formal ordering of edges is introduced as a means to simply describe unordered edges.

Given a symmetric function $p_n : \mathcal{X} \times \mathcal{X} \to [0,1]$ and a probability measure $\mu$ on $\mathcal{X}$ we may define the
randomly coloured random graph, or simply coloured random graph, $X$ with $n$ vertices as follows: Assign
to each vertex $v \in V$ a colour $X(v)$ independently according to the colour law $\mu$. Given the colours,
we connect any two vertices $u, v \in V$, independently of everything else, with connection probability
$p_n(X(u), X(v))$; otherwise we keep them disconnected. We always consider $X = ((X(v) : v \in V), E)$
under the joint law of graph and colour. We interpret $X$ as a coloured random graph and consider $X(v)$
as the colour of the vertex $v$. Denote by $\mathcal{G}_n(\mathcal{X})$ the set of all coloured graphs with colour set $\mathcal{X}$ and
$n$ vertices.
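
The construction just described can be sketched in a few lines of Python. This is an illustrative sampler only: the function name `sample_coloured_graph`, the dictionary encoding of the colour law and the example choice $p_n(a,b) = C(a,b)/n$ are assumptions made for the sketch, not part of the paper.

```python
import random


def sample_coloured_graph(n, mu, p_n, rng):
    """Sample a coloured random graph on V = {1, ..., n}.

    mu  : dict colour -> probability (the colour law)
    p_n : symmetric function (colour, colour) -> connection probability
    rng : random.Random instance
    Returns (X, E): X maps each vertex to its colour, E is a set of pairs (u, v) with u < v.
    """
    colours = list(mu)
    weights = [mu[a] for a in colours]
    # Colours are assigned independently and identically according to mu.
    X = {v: rng.choices(colours, weights=weights)[0] for v in range(1, n + 1)}
    # Given the colours, each pair u < v is connected independently.
    E = {(u, v)
         for u in range(1, n + 1)
         for v in range(u + 1, n + 1)
         if rng.random() < p_n(X[u], X[v])}
    return X, E


# Hypothetical example: two colours and connection probabilities C(a, b) / n.
n = 200
mu = {"a": 0.5, "b": 0.5}
C = {("a", "a"): 1.0, ("a", "b"): 2.0, ("b", "a"): 2.0, ("b", "b"): 0.5}
X, E = sample_coloured_graph(n, mu, lambda a, b: C[(a, b)] / n, random.Random(0))
```

Storing each edge as a pair with $u < v$ mirrors the formal ordering used in the definition of $\mathcal{E}$.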

We look at the coloured random graph models in three regimes, the near-critical, subcritical and
supercritical cases. Thus, we consider the cases when the connection probabilities satisfy
$a_n^{-1} p_n(a,b) \to C(a,b)$ for all $a, b \in \mathcal{X}$, where the sequence $(a_n)$ is such that either
$n a_n \to 1$, or $n a_n \to 0$, or $n a_n \to \infty$, and $C : \mathcal{X} \times \mathcal{X} \to [0, \infty)$.

The rest of the paper is organized in the following way. In Section 2 all our results are stated. We
state the asymptotic equipartition properties for both models in Subsection 2.1, beginning with the
case of simple hierarchical structures, followed by the case of simple networked structures. In
Subsection 2.2, we compute the asymptotic number of bits needed to encode a large amount of data from
the model of the mtDNA$^{4977}$ deletion and from the metabolic network. Section 3 contains the proofs of the main results.
There we state and prove some large deviation principles for subcritical and supercritical coloured random
graphs. We derive our main results from Theorems 3.1, 3.2, 3.3 and 3.4.

2. Statement of main results



2.1 Asymptotic Equipartition Properties. The underlying question is: how many bits are
needed to store or transmit the information contained in structured data consisting of $n$ units
connected by a number of links?

Clearly, if no probabilistic structure is imposed, one needs of order $n$ bits to transmit the units and of
order $n^2$ bits to transmit the links of the network data structure. By imposing a probabilistic structure
one can often transmit the structure at much cheaper cost with arbitrarily high probability. This is
explained by the Shannon-McMillan-Breiman theorems for networked structures modelled as sparse
coloured random graphs, and hierarchical structures modelled as multitype Galton-Watson trees.

Suppose $q$ is the distribution of a message $Y_n$ generated by a hierarchical or networked source and
let $H$ be the entropy of the source. Then we shall say that $-\log_2 q(Y_n) \approx nH$ bits with high probability if, as
$n \to \infty$,

$$-\tfrac{1}{n} \log_2 q(Y_n) \to H \quad \text{in probability.}$$

We denote by $\mathbb{P}_n$ the law of a multitype Galton-Watson tree conditioned to have $n$ vertices and write

$$P_n(x) := \mathbb{P}_n\{X = x\}, \quad \text{for } x \in \mathcal{T}.$$

We state the asymptotic equipartition property (AEP) for simple hierarchical data structures. 

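Theorem 2.1. Suppose $X = (X(v) : v \in V(T))$ is an irreducible, critical multitype Galton-Watson
tree with finite type space $\mathcal{X}$ and bounded offspring kernel $Q$. Then, for every $\varepsilon > 0$,

$$\lim_{n \to \infty} \mathbb{P}_n\Big\{ \Big| -\tfrac{1}{n} \log P_n(X) + \sum_{(a,c) \in \mathcal{X} \times \mathcal{X}^*} \pi(a)\, Q\{c \mid a\} \log Q\{c \mid a\} \Big| \ge \varepsilon \Big\} = 0.$$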



We can extract from Theorem 2.1 the following useful information: To transmit the information
contained in a large critical multitype Galton-Watson tree one needs, with high probability, about

$$-\frac{n}{\log 2} \sum_{(a,c) \in \mathcal{X} \times \mathcal{X}^*} \pi(a)\, Q\{c \mid a\} \log Q\{c \mid a\} \quad \text{bits},$$

where $n$ is the number of vertices in the tree. We consider the following example from the field of
biology.

Mutations in mitochondrial DNA. Mitochondria are organelles in cells carrying their own DNA.
Like nuclear DNA, mtDNA is subject to mutations which may take the form of base substitutions,
duplications or deletions. The mtDNA population is modelled by a two-type process where the units are
a (normal) and b (mutant), and the links are mother-child relations. A normal can give birth to either
two normals or, if there is a mutation, one normal and one mutant. Suppose the latter happens with
probability (mutation rate) $\alpha$. Mutants can only give birth to mutants. A DNA molecule may also
die without reproducing.

Let the survival probabilities be $p \in [0, \frac{1}{2-\alpha}]$ and $q \in [0, \frac12]$ for normals and mutants resp. We
assume that the population is started from one normal ancestor. Suppose the offspring kernel $Q$ is
given by $Q\{(0, \emptyset) \mid a\} = 1 - p$, $Q\{(2, (a,b)) \mid a\} = p\alpha$, $Q\{(2, (a,a)) \mid a\} = p(1-\alpha)$, $Q\{(0, \emptyset) \mid b\} = 1 - q$
and $Q\{(2, (b,b)) \mid b\} = q$. Then the process $X$ is a multitype Galton-Watson process with matrix $A$
(with index set $\{a, b\}$) given by

$$A = \begin{pmatrix} p(2-\alpha) & 0 \\ p\alpha & 2q \end{pmatrix}.$$

We restrict ourselves to the special case when $p = q = \frac12$ and $\alpha > 0$. This case corresponds to the
model for non-dividing tissue such as the brain. This means that the population of mtDNA is kept
constant on average but that mitochondrial DNA keeps reproducing also in non-dividing cells. See,
for example, Arking [1] or Olofsson and Shaw [14] and the references therein.

We observe that, in this special case, $X$ is critical and irreducible, with $\pi(a) = \pi(b) = \frac12$. Therefore,
by Theorem 2.1 one needs with high probability approximately

$$n \Big[ 1 - \frac{1}{\log 16} \big( \alpha \log \alpha + (1-\alpha) \log(1-\alpha) \big) \Big] \quad \text{bits}, \qquad (2.1)$$

in order to store or transmit data from a model of non-dividing tissues. For more examples of data
sources with tree structure, we refer to Kimmel and Axelrod [11] or Mode [12].
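
As a sanity check on the arithmetic behind (2.1), the following Python sketch evaluates the general expression $-\sum_{(a,c)} \pi(a) Q\{c \mid a\} \log_2 Q\{c \mid a\}$ for the kernel of the mtDNA example and compares it with the closed form per vertex. The values $\pi(a) = \pi(b) = \frac12$ are taken as stated in the text; the function names, the tuple encoding of offspring and the value of the mutation rate are assumptions made for illustration.

```python
import math


def bits_per_vertex(pi, Q):
    """Return -sum_a pi(a) sum_c Q{c|a} log2 Q{c|a}, i.e. bits per vertex."""
    return -sum(pi[a] * prob * math.log2(prob)
                for a in Q for prob in Q[a].values() if prob > 0)


def mtdna_kernel(p, q, alpha):
    """Offspring kernel of the mtDNA example; keys are tuples of child types."""
    return {
        "a": {(): 1 - p, ("a", "b"): p * alpha, ("a", "a"): p * (1 - alpha)},
        "b": {(): 1 - q, ("b", "b"): q},
    }


def closed_form(alpha):
    """Bits per vertex according to (2.1), with natural logarithms as in the text."""
    s = alpha * math.log(alpha) + (1 - alpha) * math.log(1 - alpha)
    return 1 - s / math.log(16)


alpha = 0.5                # hypothetical mutation rate, for illustration only
pi = {"a": 0.5, "b": 0.5}  # as stated in the text for p = q = 1/2
direct = bits_per_vertex(pi, mtdna_kernel(0.5, 0.5, alpha))
```

Multiplying either quantity by the number of vertices $n$ gives the approximate number of bits in (2.1).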



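We state the AEP for networked data structures described by coloured random graphs. By $\mathbb{P}_n$ we also
denote the (probability) law of a coloured random graph with $n$ vertices. We write

$$P_n(x) = \mathbb{P}_n\{X = x\}, \quad \text{for } x \in \mathcal{G}_n(\mathcal{X}).$$

Theorem 2.2. Suppose that $X$ is a coloured random graph with colour law $\mu : \mathcal{X} \to (0,1]$ and
connection probabilities $p_n$ such that $a_n^{-1} p_n(a,b) \to C(a,b)$ for some sequence $(a_n)$ with $a_n n \log n \to \infty$
and $\log a_n / \log n \to -1$. Then, for every $\varepsilon > 0$,

$$\lim_{n \to \infty} \mathbb{P}_n\Big\{ \Big| -\frac{1}{a_n n^2 \log n} \log P_n(X) - \frac12 \sum_{a,b \in \mathcal{X}} \mu(a) C(a,b) \mu(b) \Big| > \varepsilon \Big\} = 0.$$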

In other words, in order to transmit a coloured random graph in the given regime one needs, with high
probability, about

$$\frac{a_n n^2 \log n}{2 \log 2} \sum_{a,b \in \mathcal{X}} \mu(a) C(a,b) \mu(b) \quad \text{bits}.$$

The most interesting regime is when the cost of transmitting colours and transmitting edges is of
comparable order, i.e. when $a_n^{-1} = n \log n$. In this case one obtains the following
Shannon-McMillan-Breiman theorem.

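Theorem 2.3. Suppose that $X$ is a coloured random graph with colour law $\mu : \mathcal{X} \to (0,1]$ and
connection probabilities $p_n$ such that $(n \log n)\, p_n(a,b) \to C(a,b)$ for $C : \mathcal{X} \times \mathcal{X} \to [0, \infty)$ symmetric. Then,
for every $\varepsilon > 0$,

$$\lim_{n \to \infty} \mathbb{P}_n\Big\{ \Big| -\frac{1}{n} \log P_n(X) - \frac12 \sum_{a,b \in \mathcal{X}} \mu(a) C(a,b) \mu(b) + \sum_{a \in \mathcal{X}} \mu(a) \log \mu(a) \Big| > \varepsilon \Big\} = 0.$$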

Interpretation. From Theorem 2.3 one can deduce that the number of bits needed to code
a networked data structure consisting of $n$ units connected by an average number of order $n/\log n$
links is, with high probability, about $nH$, where $H$ is the entropy defined by

$$H := \frac{1}{\log 2} \Big[ \frac12 \sum_{a,b \in \mathcal{X}} \mu(a) C(a,b) \mu(b) - \sum_{a \in \mathcal{X}} \mu(a) \log \mu(a) \Big]. \qquad (2.2)$$
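
To make the entropy in (2.2) concrete, here is a minimal Python sketch that evaluates $H$ for a given colour law and limiting function $C$. The function name `network_entropy`, the dictionary encoding and the numerical values are assumptions chosen for illustration; only the formula itself comes from the text.

```python
import math


def network_entropy(mu, C):
    """Evaluate H from (2.2) in bits per unit.

    mu : dict colour -> probability (all probabilities positive)
    C  : dict (colour, colour) -> nonnegative number, symmetric
    """
    edge_term = 0.5 * sum(mu[a] * C[(a, b)] * mu[b] for a in mu for b in mu)
    colour_term = -sum(m * math.log(m) for m in mu.values())
    return (edge_term + colour_term) / math.log(2)


# Hypothetical example: two equally likely colours and a constant C.
mu = {"a": 0.5, "b": 0.5}
C_const = {(a, b): 2.0 for a in mu for b in mu}
H = network_entropy(mu, C_const)
```

With $C \equiv 0$ the edge term vanishes and $H$ reduces to the Shannon entropy of the colour law, which is one bit for two equally likely colours.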

Metabolic network. We consider a metabolic network of the energy and biosynthesis metabolism
of the bacterium E. coli. Here, the units represent substrates and products, and links represent
interactions. Suppose half the nodes in the graph are of unit a (substrate) and half are of unit b (product),
and links between pairs of units $(a, b)$ occur independently with connection probability $C(a,b)/n$, where
$C : \{a, b\} \times \{a, b\} \to [0, \infty)$ is a nonzero symmetric function and $n$ is the size of the graph. We write

$$H := \frac{1}{8 \log 2} \big( 2C(a,b) + C(a,a) + C(b,b) \big).$$

Then, by Theorem 2.2, one needs with high probability about $(n \log n) H$ bits to transmit the data
contained in the metabolic network of the bacterium E. coli.

3. Proof of main results 

3.1 LDP for the empirical offspring measure. We present the recent large deviation principle
for empirical offspring measures on random trees, see Dembo et al. [8]. We recall from the introductory
section that $|T|$ is the number of vertices and $V = V(T)$ is the set of all vertices in the tree $T$. We also
recall that $m(a,c)$ is the multiplicity of the symbol $a$ in $c = (n, a_1, \ldots, a_n)$ and that

$$\mathcal{X}^* = \bigcup_{n=0}^{\infty} \{n\} \times \mathcal{X}^n.$$

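For every multitype Galton-Watson tree $X$, the empirical offspring measure $M_X$ is defined by

$$M_X(a,c) = \frac{1}{|T|} \sum_{v \in V} \delta_{(X(v), C(v))}(a,c), \quad \text{for } (a,c) \in \mathcal{X} \times \mathcal{X}^*.$$

We call a measure $\nu$ on $\mathcal{X} \times \mathcal{X}^*$ shift-invariant if

$$\nu_1(a) = \sum_{(b,c) \in \mathcal{X} \times \mathcal{X}^*} m(a,c)\, \nu(b,c), \quad \text{for all } a \in \mathcal{X},$$

where $\nu_1$ denotes the $\mathcal{X}$-marginal of $\nu$.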
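
The empirical offspring measure can be computed directly from a typed tree. The sketch below is illustrative: it assumes a hypothetical encoding in which `tree[v]` is a pair consisting of the type of $v$ and the ordered list of its children, and it returns $M_X$ as a dictionary keyed by pairs $(a, c)$ with $c = (n, (a_1, \ldots, a_n))$.

```python
from collections import Counter


def empirical_offspring_measure(tree, root):
    """Return M_X as a dict mapping (a, c) to its mass, where c = (n, child types)."""
    counts = Counter()
    stack = [root]
    while stack:
        v = stack.pop()
        vtype, children = tree[v]
        # c records the number of children and their types, ordered from left to right
        c = (len(children), tuple(tree[w][0] for w in children))
        counts[(vtype, c)] += 1
        stack.extend(children)
    size = sum(counts.values())  # |T|, the number of vertices
    return {key: k / size for key, k in counts.items()}


# Hypothetical tree: a root of type a with one child of type a and one of type b.
tree = {"r": ("a", ["x", "y"]), "x": ("a", []), "y": ("b", [])}
M = empirical_offspring_measure(tree, "r")
```

Each vertex contributes mass $1/|T|$ to the pair formed by its own type and its offspring configuration, so the masses always sum to one.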

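We denote by $\mathcal{M}(\mathcal{X} \times \mathcal{X}^*)$ the space of probability measures $\nu$ on $\mathcal{X} \times \mathcal{X}^*$ with $\int n\, \nu(da, dc) < \infty$,
using the convention $c = (n, a_1, \ldots, a_n)$. We endow this space with the smallest topology which makes
the functionals $\nu \mapsto \int f(b,c)\, \nu(db, dc)$ continuous, for $f : \mathcal{X} \times \mathcal{X}^* \to \mathbb{R}$ either bounded, or

$$f(b,c) = m(a,c)\, 1_{b_0}(b) \quad \text{for some } a, b_0 \in \mathcal{X}.$$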

Theorem 3.1 (Dembo et al. [8]). Suppose that $X$ is an irreducible, critical multitype Galton-Watson
tree with an offspring law whose exponential moments are all finite, conditioned to have exactly $n$
vertices. Then, for $n \to \infty$, the empirical offspring measure $M_X$ satisfies a large deviation principle
in $\mathcal{M}(\mathcal{X} \times \mathcal{X}^*)$ with speed $n$ and the convex, good rate function

$$J(\nu) = \begin{cases} H(\nu \,\|\, \nu_1 \otimes Q) & \text{if } \nu \text{ is shift-invariant,} \\ \infty & \text{otherwise.} \end{cases} \qquad (3.1)$$

We remark that, under this conditioning, the critical and noncritical cases give rise to the same tree. See Dembo
et al. [8].

3.2 Large deviation principle for sparse random coloured graphs. 

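For any finite or countable set $\mathcal{Y}$ we denote by $\mathcal{M}(\mathcal{Y})$ the space of probability measures, and by $\tilde{\mathcal{M}}(\mathcal{Y})$
the space of finite measures on $\mathcal{Y}$, both endowed with the weak topology. We denote by $\tilde{\mathcal{M}}_*(\mathcal{Y} \times \mathcal{Y})$
the subspace of symmetric measures in $\tilde{\mathcal{M}}(\mathcal{Y} \times \mathcal{Y})$. We recall that $V$ is a fixed set of $n$ vertices and
$E \subset \mathcal{E} := \{(u,v) \in V \times V : u < v\}$ is the edge set.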

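We associate with any coloured random graph $X$ with $n$ vertices a probability measure, the empirical
colour measure $L^1 \in \mathcal{M}(\mathcal{X})$, defined by

$$L^1(a) := \frac{1}{n} \sum_{v \in V} \delta_{X(v)}(a), \quad \text{for } a \in \mathcal{X},$$

and a symmetric finite measure, the empirical pair measure $L^2 \in \tilde{\mathcal{M}}_*(\mathcal{X} \times \mathcal{X})$, defined by

$$L^2(a,b) := \frac{1}{n^2 a_n} \sum_{(u,v) \in E} \big[ \delta_{(X(v), X(u))} + \delta_{(X(u), X(v))} \big](a,b), \quad \text{for } a, b \in \mathcal{X}.$$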
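
Both empirical measures are simple counting statistics of the coloured graph. The following Python sketch computes them for a graph given as a colour map and an edge set; the function name and the data encoding are assumptions made for illustration, and the tiny example graph is invented.

```python
from collections import Counter


def empirical_measures(X, E, a_n):
    """Return (L1, L2) for a coloured graph.

    X   : dict vertex -> colour
    E   : iterable of edges (u, v) with u < v
    a_n : normalizing sequence value, so that L2 has total mass 2|E| / (n^2 a_n)
    """
    n = len(X)
    L1 = Counter()
    for v in X:
        L1[X[v]] += 1.0 / n
    L2 = Counter()
    for (u, v) in E:
        # each edge is counted in both orientations, which makes L2 symmetric
        L2[(X[u], X[v])] += 1.0 / (n * n * a_n)
        L2[(X[v], X[u])] += 1.0 / (n * n * a_n)
    return dict(L1), dict(L2)


# Hypothetical graph on three vertices with two edges and a_n = 1/n.
X = {1: "a", 2: "b", 3: "a"}
E = {(1, 2), (1, 3)}
L1, L2 = empirical_measures(X, E, a_n=1.0 / 3)
```

An edge between two vertices of the same colour $a$ contributes twice to $L^2(a,a)$, in line with the symmetrized definition.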

The total mass $\|L^2\|$ of $L^2$ is $2|E|/(n^2 a_n)$. The next theorem is the LDP for the empirical colour
measure and the empirical pair measure of a class of sparse coloured random graphs, i.e. $a_n = \frac{1}{n}$.

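Theorem 3.2 (Doku et al. [7]). Suppose that $X$ is a coloured random graph with colour law $\mu$ and
edge probabilities satisfying $n p_n(a,b) \to C(a,b)$ for some symmetric function $C : \mathcal{X} \times \mathcal{X} \to [0, \infty)$.
Then, as $n \to \infty$, the pair $(L^1, L^2)$ satisfies a large deviation principle in $\mathcal{M}(\mathcal{X}) \times \tilde{\mathcal{M}}_*(\mathcal{X} \times \mathcal{X})$ with
good rate function

$$I(\omega, \varpi) = H(\omega \,\|\, \mu) + \tfrac12 \mathfrak{H}_C(\varpi \,\|\, \omega), \qquad (3.2)$$

where $\mathfrak{H}_C(\varpi \,\|\, \omega) := H(\varpi \,\|\, C\omega \otimes \omega) + \|C\omega \otimes \omega\| - \|\varpi\|$ is a non-negative function and
$C\omega \otimes \omega(a,b) := C(a,b)\, \omega(a)\, \omega(b)$.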
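
The rate function in (3.2) can be evaluated numerically for finite colour sets. The sketch below is an illustration under an explicit assumption: for finite measures the relative entropy is taken to be $H(\nu \,\|\, \tilde\nu) = \sum_x \nu(x) \log(\nu(x)/\tilde\nu(x))$, with value $\infty$ if $\nu$ charges a point that $\tilde\nu$ does not. The function names and the example values are hypothetical.

```python
import math


def relative_entropy(nu, ref):
    """Sum of nu(x) log(nu(x) / ref(x)); infinite if nu charges a point where ref is zero."""
    total = 0.0
    for key, mass in nu.items():
        if mass > 0:
            if ref.get(key, 0.0) == 0:
                return math.inf
            total += mass * math.log(mass / ref[key])
    return total


def rate_function(omega, varpi, mu, C):
    """Evaluate I(omega, varpi) = H(omega || mu) + (1/2) H_C(varpi || omega) as in (3.2)."""
    c_prod = {(a, b): C[(a, b)] * omega[a] * omega[b] for a in omega for b in omega}
    h_c = relative_entropy(varpi, c_prod) + sum(c_prod.values()) - sum(varpi.values())
    return relative_entropy(omega, mu) + 0.5 * h_c


# Hypothetical example with two colours and constant C.
mu = {"a": 0.5, "b": 0.5}
C = {(a, b): 2.0 for a in mu for b in mu}
typical = {(a, b): C[(a, b)] * mu[a] * mu[b] for a in mu for b in mu}
I_typical = rate_function(mu, typical, mu, C)
I_dense = rate_function(mu, {key: 2 * m for key, m in typical.items()}, mu, C)
```

The rate vanishes at the typical pair $(\mu, C\mu \otimes \mu)$ and is strictly positive when the pair measure is doubled, which is the expected behaviour of a rate function.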

Remark 1. By exponential equivalence, see Dembo and Zeitouni [10, Theorem 4.2.13], one can obtain
from Theorem 3.2 the LDP for $(L^1, L^2)$ of any coloured random graph $X$ with connection probabilities
satisfying $a_n^{-1} p_n(a,b) \to C(a,b)$, for some sequence $(a_n)$ with $n a_n \to 1$ and $C : \mathcal{X} \times \mathcal{X} \to [0, \infty)$
symmetric.

The proof of Theorem 3.2 uses the Gärtner-Ellis theorem and the technique of mixing, see Biggins [2].

3.3 Large-deviation principles in the sub- and supercritical cases. We use large deviation
techniques to study asymptotic properties of the coloured random graphs for large $n$ in the subcritical
and supercritical cases. In the rest of the paper, we assume that $a_n \to 0$ as $n$ approaches infinity.

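Theorem 3.3. Suppose that $X$ is a coloured random graph with colour law $\mu : \mathcal{X} \to (0,1]$ and edge
probabilities $p_n : \mathcal{X} \times \mathcal{X} \to [0,1]$ satisfying $a_n^{-1} p_n(a,b) \to C(a,b)$, for some sequence $(a_n)$ with $n a_n \to \infty$
and $C : \mathcal{X} \times \mathcal{X} \to [0, \infty)$ symmetric. Then, for $n \to \infty$, the pair $(L^1, L^2)$ satisfies a large deviation
principle in $\mathcal{M}(\mathcal{X}) \times \tilde{\mathcal{M}}_*(\mathcal{X} \times \mathcal{X})$ with speed

(i) $a_n n^2$ and good rate function

$$I_1(\omega, \varpi) = \tfrac12 \mathfrak{H}_C(\varpi \,\|\, \omega); \qquad (3.3)$$

(ii) $n$ and good rate function

$$I_2(\omega, \varpi) = \begin{cases} H(\omega \,\|\, \mu) & \text{if } \varpi = C\omega \otimes \omega, \\ \infty & \text{otherwise.} \end{cases} \qquad (3.4)$$

Remark 2. Intuitively this means that, on the scale $a_n n^2$, the colour law can be changed 'for free',
whereas on the scale $n$, once the colour law is fixed, the edge law has to be the typical one.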

Theorem 3.4. Suppose that $X$ is a coloured random graph with colour law $\mu : \mathcal{X} \to (0,1]$ and edge
probabilities $p_n : \mathcal{X} \times \mathcal{X} \to [0,1]$ satisfying $a_n^{-1} p_n(a,b) \to C(a,b)$, for some sequence $(a_n)$ with $n a_n \to 0$
and $C : \mathcal{X} \times \mathcal{X} \to [0, \infty)$ symmetric. Then, for $n \to \infty$, the pair $(L^1, L^2)$ satisfies a large deviation
principle in $\mathcal{M}(\mathcal{X}) \times \tilde{\mathcal{M}}_*(\mathcal{X} \times \mathcal{X})$ with speed

(i) $a_n n^2$ and good rate function

$$I_3(\omega, \varpi) = \begin{cases} \tfrac12 \mathfrak{H}_C(\varpi \,\|\, \omega) & \text{if } \omega = \mu, \\ \infty & \text{otherwise;} \end{cases} \qquad (3.5)$$

(ii) $n$ and good rate function

$$I_4(\omega, \varpi) = H(\omega \,\|\, \mu). \qquad (3.6)$$

Remark 3. Intuitively this means that, on the scale $n$, the edge law can be changed 'for free', whereas
on the scale $a_n n^2$ the colour law cannot be changed.

Biggins and Penman [3] have proved a large deviation principle for $2|E|/(n(n-1))$ using the technique
of mixing, see Biggins [2].



In the rest of the section we give the proofs of the large deviation principles (LDPs) for coloured random
graphs in the sub- and supercritical regimes, and use our LDPs and Theorems 3.1 and 3.2 to prove
the asymptotic equipartition properties for simple hierarchical and networked structures. We prove
our large deviation principles using the technique of (exponential) change of measure. Specifically, we
use the technique of change of measure to prove the upper bounds in Theorems 3.3 and 3.4. We then
obtain the proofs of all lower bounds, except those of Theorems 3.3(i) and 3.4(ii), from the upper bounds. The lower
bounds of Theorems 3.3(i) and 3.4(ii) are proved from the lower bounds of Theorems 3.3(ii) and 3.4(i)
respectively. All our proofs use two important lemmas, Euler's formula and the exponential tightness
lemma, which we state and prove in Subsection 3.4.



3.4 Some Useful Lemmas. 

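Lemma 3.5 (Euler's Formula). If $a_n^{-1} p_n(a,b) \to C(a,b)$ for all $a, b \in \mathcal{X}$ and $a_n \to 0$, then

$$\lim_{n \to \infty} \big[ 1 + \alpha p_n(a,b) \big]^{1/a_n} = e^{\alpha C(a,b)}, \quad \text{for all } a, b \in \mathcal{X} \text{ and } \alpha \in \mathbb{R}. \qquad (3.7)$$

Proof. Observe that, for any $\varepsilon > 0$ and for large $n$, we have

$$\big[ 1 + a_n(\alpha C(a,b) - \varepsilon) \big]^{1/a_n} \le \big[ 1 + \alpha p_n(a,b) \big]^{1/a_n} \le \big[ 1 + a_n(\alpha C(a,b) + \varepsilon) \big]^{1/a_n} \qquad (3.8)$$

by the pointwise convergence. Hence by the sandwich theorem and Euler's formula we have (3.7). ■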
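
A quick numerical illustration of Lemma 3.5 is given below. It assumes, purely for the sake of the example, that $p_n(a,b) = a_n C(a,b)$ exactly and uses invented values for $\alpha$ and $C(a,b)$; the helper name `euler_lhs` is hypothetical.

```python
import math


def euler_lhs(alpha, C_ab, a_n):
    """Evaluate [1 + alpha * p_n]^(1 / a_n) with p_n = a_n * C_ab (assumed exact)."""
    p_n = a_n * C_ab
    # log1p keeps the computation accurate when alpha * p_n is tiny
    return math.exp(math.log1p(alpha * p_n) / a_n)


alpha, C_ab = 2.0, 1.5  # hypothetical values
limit = math.exp(alpha * C_ab)
errors = [abs(euler_lhs(alpha, C_ab, a_n) / limit - 1.0) for a_n in (1e-2, 1e-4, 1e-6)]
```

The relative error shrinks as $a_n$ decreases, consistent with the convergence asserted in (3.7).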



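Note that $\mathbb{P}$ is used instead of $\mathbb{P}_n$ in all our large deviation analysis for the sake of a simple presentation.

Lemma 3.6 (Exponential Tightness). For every $\alpha > 0$, there exists $N \in \mathbb{N}$ such that

$$\limsup_{n \to \infty} \frac{1}{a_n n^2} \log \mathbb{P}\big\{ |E| > a_n n^2 N \big\} \le -\alpha. \qquad (3.9)$$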

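Proof. Let $c > \max_{a,b \in \mathcal{X}} C(a,b) > 0$. Using Chebyshev's inequality and Lemma 3.5 we have (for
sufficiently large $n$)

$$\mathbb{P}\big\{ |E| \ge a_n n^2 l \big\} \le e^{-a_n n^2 l} \sum_{k=0}^{n(n-1)/2} e^{k} \binom{n(n-1)/2}{k} (a_n c)^k (1 - a_n c)^{n(n-1)/2 - k}$$
$$= e^{-a_n n^2 l} \big( 1 + (e-1) a_n c \big)^{n(n-1)/2} \le e^{-a_n n^2 l}\, e^{a_n n^2 (c(e - 1 + o(1)))}.$$

Now given $\alpha$ choose $N \in \mathbb{N}$ such that $N > \alpha + c(e-1)$ and observe that, for sufficiently large $n$,

$$\mathbb{P}\big\{ |E| \ge a_n n^2 N \big\} \le e^{-a_n n^2 (N - c(e-1) + o(1))} \le e^{-a_n n^2 \alpha},$$

which implies the statement. ■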
3.5 Change of measure on the scale n. 

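We denote by $\mathcal{C}_2$ the space of symmetric functions on $\mathcal{X} \times \mathcal{X}$ and by $\mathcal{C}_1$ the space of functions on $\mathcal{X}$.
Given a function $\tilde f : \mathcal{X} \to \mathbb{R}$ and a symmetric function $\tilde g : \mathcal{X} \times \mathcal{X} \to \mathbb{R}$, define the constant $U_{\tilde f}$ by

$$U_{\tilde f} = \log \sum_{a \in \mathcal{X}} e^{\tilde f(a)} \mu(a).$$

For the function $\tilde g \in \mathcal{C}_2$ we define the symmetric function $\tilde h_n^{(2)} : \mathcal{X} \times \mathcal{X} \to \mathbb{R}$ by

$$\tilde h_n^{(2)}(a,b) = \log \Big[ \big( 1 - p_n(a,b) + p_n(a,b)\, e^{\tilde g(a,b)/(n a_n)} \big)^{-n} \Big]. \qquad (3.10)$$

Use $\tilde f, \tilde g$ to define (for sufficiently large $n$) a new coloured random graph in the following manner.

• To the vertices $V = \{1, \ldots, n\}$ we assign colours from $\mathcal{X}$ independently and identically
according to the colour law $\tilde\mu$ defined by

$$\tilde\mu(a) = e^{\tilde f(a) - U_{\tilde f}} \mu(a).$$

• Given any two vertices $u, v \in V$, with $u$ carrying colour $a$ and $v$ carrying colour $b$, connect
vertex $u$ to vertex $v$ with probability

$$\tilde p_n(a,b) = \frac{p_n(a,b)\, e^{\tilde g(a,b)/(n a_n)}}{1 - p_n(a,b) + p_n(a,b)\, e^{\tilde g(a,b)/(n a_n)}}, \qquad (3.11)$$

otherwise keep $u$ and $v$ disconnected.

For this new graph, observe that $\tilde\mu$ is a probability measure and further that $\tilde p_n(a,b) \in [0,1]$ for all
$a, b \in \mathcal{X}$. Denote by $\tilde{\mathbb{P}}$ the law of the new coloured random graph constructed from $\tilde\mu$ and $\tilde p_n$. We note
from the construction of the new graph that $\tilde{\mathbb{P}}$ is absolutely continuous with respect to $\mathbb{P}$, as for a
coloured random graph $X$,

$$\frac{d\tilde{\mathbb{P}}}{d\mathbb{P}}(X) = \prod_{u \in V} \frac{\tilde\mu(X(u))}{\mu(X(u))} \prod_{(u,v) \in E} \frac{\tilde p_n(X(u), X(v))}{p_n(X(u), X(v))} \prod_{(u,v) \notin E} \frac{1 - \tilde p_n(X(u), X(v))}{1 - p_n(X(u), X(v))}$$
$$= \prod_{u \in V} \frac{\tilde\mu(X(u))}{\mu(X(u))} \prod_{(u,v) \in E} \frac{\tilde p_n(X(u), X(v))}{p_n(X(u), X(v))} \cdot \frac{n - n p_n(X(u), X(v))}{n - n \tilde p_n(X(u), X(v))} \prod_{(u,v) \in \mathcal{E}} \frac{n - n \tilde p_n(X(u), X(v))}{n - n p_n(X(u), X(v))}$$
$$= \prod_{u \in V} e^{\tilde f(X(u)) - U_{\tilde f}} \prod_{(u,v) \in E} e^{\tilde g(X(u), X(v))/(n a_n)} \prod_{(u,v) \in \mathcal{E}} e^{\tilde h_n^{(2)}(X(u), X(v))/n}$$
$$= e^{\, n \langle L^1, \tilde f - U_{\tilde f} \rangle + n \langle \frac12 L^2, \tilde g \rangle + n \langle \frac12 L^1 \otimes L^1, \tilde h_n^{(2)} \rangle - \langle \frac12 L^1_\Delta, \tilde h_n^{(2)} \rangle}, \qquad (3.12)$$

where $L^1_\Delta(a,a) = \frac{1}{n} \sum_{u \in V} \delta_{(X(u), X(u))}(a,a)$, for $a \in \mathcal{X}$, and $\sum_{a \in \mathcal{X}} L^1_\Delta(a,a) = 1$.

3.6 Change of measure on the scale $a_n n^2$. Define for $\hat g \in \mathcal{C}_2$ the function $\hat h_n^{(1)} : \mathcal{X} \times \mathcal{X} \to \mathbb{R}$ by

$$\hat h_n^{(1)}(a,b) = -\log \Big[ \big( 1 - p_n(a,b) + p_n(a,b)\, e^{\hat g(a,b)} \big)^{1/a_n} \Big].$$

Define for $\hat f \in \mathcal{C}_1$ and $\hat g$ a new coloured random graph (for sufficiently large $n$) in the following way:

• Assign to the $n$ vertices in $V$ colours from $\mathcal{X}$ independently and identically according to the
colour law $\hat\mu$ defined by

$$\hat\mu(a) = e^{\hat f(a) - U_{\hat f}} \mu(a). \qquad (3.13)$$

• Given any two vertices $u, v \in V$, with $u$ carrying colour $a$ and $v$ carrying colour $b$, connect
vertex $u$ to vertex $v$ with probability

$$\hat p_n(a,b) = \frac{p_n(a,b)\, e^{\hat g(a,b)}}{1 - p_n(a,b) + p_n(a,b)\, e^{\hat g(a,b)}}, \qquad (3.14)$$

otherwise keep $u$ and $v$ disconnected.

Note that the colour law $\hat\mu$ is a probability measure and the connection probabilities satisfy $\hat p_n(a,b) \in
[0,1]$ for all $a, b \in \mathcal{X}$. We denote by $\hat{\mathbb{P}}$ the law of the coloured random graph obtained from $\hat\mu$ and $\hat p_n$.
By construction $\hat{\mathbb{P}}$ is absolutely continuous with respect to $\mathbb{P}$, as for a coloured random graph $X$,

$$\frac{d\hat{\mathbb{P}}}{d\mathbb{P}}(X) = \prod_{u \in V} \frac{\hat\mu(X(u))}{\mu(X(u))} \prod_{(u,v) \in E} \frac{\hat p_n(X(u), X(v))}{p_n(X(u), X(v))} \prod_{(u,v) \notin E} \frac{1 - \hat p_n(X(u), X(v))}{1 - p_n(X(u), X(v))}$$
$$= \prod_{u \in V} e^{\hat f(X(u)) - U_{\hat f}} \prod_{(u,v) \in E} \frac{\hat p_n(X(u), X(v))}{p_n(X(u), X(v))} \cdot \frac{1 - p_n(X(u), X(v))}{1 - \hat p_n(X(u), X(v))} \prod_{(u,v) \in \mathcal{E}} \frac{1 - \hat p_n(X(u), X(v))}{1 - p_n(X(u), X(v))}$$
$$= \prod_{u \in V} e^{\hat f(X(u)) - U_{\hat f}} \prod_{(u,v) \in E} e^{\hat g(X(u), X(v))} \prod_{(u,v) \in \mathcal{E}} e^{a_n \hat h_n^{(1)}(X(u), X(v))}$$
$$= e^{\, n \langle L^1, \hat f - U_{\hat f} \rangle + a_n n^2 \langle \frac12 L^2, \hat g \rangle + a_n n^2 \langle \frac12 L^1 \otimes L^1, \hat h_n^{(1)} \rangle - a_n n \langle \frac12 L^1_\Delta, \hat h_n^{(1)} \rangle}, \qquad (3.15)$$

where $L^1_\Delta(a,a) = \frac{1}{n} \sum_{u \in V} \delta_{(X(u), X(u))}(a,a)$, for $a \in \mathcal{X}$, and $\sum_{a \in \mathcal{X}} L^1_\Delta(a,a) = 1$.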

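3.7 Upper bound in Theorem 3.3(ii). Define for $(\omega, \varpi) \in \mathcal{M}(\mathcal{X}) \times \tilde{\mathcal{M}}_*(\mathcal{X} \times \mathcal{X})$ the function $\hat I_2(\omega, \varpi)$ by

$$\hat I_2(\omega, \varpi) = \sup_{\tilde f \in \mathcal{C}_1,\, \tilde g \in \mathcal{C}_2} \Big\{ \sum_{a \in \mathcal{X}} \big( \tilde f(a) - U_{\tilde f} \big) \omega(a) + \sum_{a,b \in \mathcal{X}} \tfrac12 \tilde g(a,b) \big( \varpi(a,b) - C(a,b) \omega(a) \omega(b) \big) \Big\}. \qquad (3.16)$$

Lemma 3.7. For each closed set $F \subset \mathcal{M}(\mathcal{X}) \times \tilde{\mathcal{M}}_*(\mathcal{X} \times \mathcal{X})$, we have

$$\limsup_{n \to \infty} \frac{1}{n} \log \mathbb{P}\big\{ (L^1, L^2) \in F \big\} \le -\inf_{(\omega, \varpi) \in F} \hat I_2(\omega, \varpi).$$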

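Proof. Fix $\tilde f \in \mathcal{C}_1$. For any $\tilde g \in \mathcal{C}_2$, we define $\tilde\beta : \mathcal{X} \times \mathcal{X} \to \mathbb{R}$ by $\tilde\beta(a,b) = -\tilde g(a,b) C(a,b)$.

We notice from Lemma 3.5 that $\lim_{n \to \infty} \tilde h_n^{(2)}(a,b) = \tilde\beta(a,b)$ for all $a, b \in \mathcal{X}$. Hence, for any $\delta > 0$ and
for (sufficiently) large $n$, we have

$$\tilde h_n^{(2)}(a,b) \le |\tilde\beta(a,b)| + \delta, \quad \text{for all } a, b \in \mathcal{X}. \qquad (3.17)$$

Using (3.12) and (3.17) we obtain

$$e^{(\max_{a \in \mathcal{X}} |\tilde\beta(a,a)| + \delta)/2} \ge \int e^{\langle \frac12 L^1_\Delta, \tilde h_n^{(2)} \rangle}\, d\tilde{\mathbb{P}} = \mathbb{E}\Big\{ e^{\, n \langle L^1, \tilde f - U_{\tilde f} \rangle + n \langle \frac12 L^2, \tilde g \rangle + n \langle \frac12 L^1 \otimes L^1, \tilde h_n^{(2)} \rangle} \Big\}$$

for any $\delta > 0$ and for large $n$. Therefore, we have

$$\limsup_{n \to \infty} \frac{1}{n} \log \mathbb{E}\Big\{ e^{\, n \langle L^1, \tilde f - U_{\tilde f} \rangle + n \langle \frac12 L^2, \tilde g \rangle + n \langle \frac12 L^1 \otimes L^1, \tilde h_n^{(2)} \rangle} \Big\} \le 0. \qquad (3.18)$$

We now fix $\varepsilon > 0$ and write $\hat I_2^\varepsilon(\omega, \varpi) := \min\{\hat I_2(\omega, \varpi), \varepsilon^{-1}\} - \varepsilon$. Let $F$ be a closed subset of $\mathcal{M}(\mathcal{X}) \times \tilde{\mathcal{M}}_*(\mathcal{X} \times \mathcal{X})$
and suppose $(\omega, \varpi) \in F$. Choose $\tilde f \in \mathcal{C}_1$, $\tilde g \in \mathcal{C}_2$ such that

$$\langle \omega, \tilde f - U_{\tilde f} \rangle + \tfrac12 \langle \varpi, \tilde g \rangle - \tfrac12 \langle \omega \otimes \omega, C \tilde g \rangle \ge \hat I_2^\varepsilon(\omega, \varpi).$$

Since $\mathcal{X}$ is finite, we can find open neighbourhoods $B^2_\varpi$ and $B^1_\omega$ of $\varpi$, $\omega$ such that

$$\inf_{\tilde\omega \in B^1_\omega,\, \tilde\varpi \in B^2_\varpi} \Big\{ \langle \tilde\omega, \tilde f - U_{\tilde f} \rangle + \tfrac12 \langle \tilde\varpi, \tilde g \rangle - \tfrac12 \langle \tilde\omega \otimes \tilde\omega, C \tilde g \rangle \Big\} \ge \hat I_2^\varepsilon(\omega, \varpi) - \varepsilon.$$

Using Chebyshev's inequality and (3.18), we have that

$$\limsup_{n \to \infty} \frac{1}{n} \log \mathbb{P}\big\{ (L^1, L^2) \in B^1_\omega \times B^2_\varpi \big\}$$
$$\le \limsup_{n \to \infty} \frac{1}{n} \log \mathbb{E}\Big\{ e^{\, n \langle L^1, \tilde f - U_{\tilde f} \rangle + n \langle \frac12 L^2, \tilde g \rangle + n \langle \frac12 L^1 \otimes L^1, \tilde h_n^{(2)} \rangle} \Big\} - \hat I_2^\varepsilon(\omega, \varpi) + \varepsilon \qquad (3.19)$$
$$\le -\hat I_2^\varepsilon(\omega, \varpi) + \varepsilon.$$

Use Lemma 3.6 to choose $N \in \mathbb{N}$ such that

$$\limsup_{n \to \infty} \frac{1}{n} \log \mathbb{P}\big\{ |E| > a_n n^2 N \big\} = -\infty.$$

For this $N$ define the set $K_N$ by

$$K_N = \big\{ (\omega, \varpi) \in \mathcal{M}(\mathcal{X}) \times \tilde{\mathcal{M}}_*(\mathcal{X} \times \mathcal{X}) : \|\varpi\| \le 2N \big\}.$$

The set $K_N \cap F$ is compact and therefore may be covered by finitely many sets
$B^1_{\omega_r} \times B^2_{\varpi_r}$ with $(\omega_r, \varpi_r) \in F$ for $r = 1, \ldots, m$. Hence, we have

$$\mathbb{P}\big\{ (L^1, L^2) \in F \big\} \le \sum_{r=1}^{m} \mathbb{P}\big\{ (L^1, L^2) \in B^1_{\omega_r} \times B^2_{\varpi_r} \big\} + \mathbb{P}\big\{ (L^1, L^2) \notin K_N \big\}.$$

We may now use (3.19) to obtain, for all sufficiently small $\varepsilon > 0$,

$$\limsup_{n \to \infty} \frac{1}{n} \log \mathbb{P}\big\{ (L^1, L^2) \in F \big\} \le \max_{r=1}^{m} \limsup_{n \to \infty} \frac{1}{n} \log \mathbb{P}\big\{ (L^1, L^2) \in B^1_{\omega_r} \times B^2_{\varpi_r} \big\} \vee (-\infty)$$
$$\le -\inf_{(\omega, \varpi) \in F} \hat I_2^\varepsilon(\omega, \varpi) + \varepsilon.$$

Taking $\varepsilon \downarrow 0$ we have the desired statement. ■

We solve the variational problem on the right side of equation (3.16).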


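Lemma 3.8. $\hat I_2(\omega, \varpi) = H(\omega \,\|\, \mu)$ if (and only if) $\varpi = C\omega \otimes \omega$, and $\infty$ otherwise.

Proof. Suppose that $\varpi \ne C\omega \otimes \omega$. Then there exist $a_0, b_0 \in \mathcal{X}$ such that $\varpi(a_0, b_0) > C(a_0, b_0) \omega(a_0) \omega(b_0)$
or $\varpi(a_0, b_0) < C(a_0, b_0) \omega(a_0) \omega(b_0)$. Define for this $a_0, b_0 \in \mathcal{X}$ the symmetric function $\tilde g$ by

$$\tilde g(a,b) = K \big( 1_{(a_0, b_0)}(a,b) + 1_{(b_0, a_0)}(a,b) \big), \quad \text{for } a, b \in \mathcal{X} \text{ and } K \in \mathbb{R}. \qquad (3.20)$$

Considering this $\tilde g$ in (3.16) we have

$$\sum_{a,b \in \mathcal{X}} \tfrac12 \tilde g(a,b) \varpi(a,b) + \sum_{a,b \in \mathcal{X}} -\tfrac12 \tilde g(a,b) C(a,b) \omega(a) \omega(b) = K \big( \varpi(a_0, b_0) - C(a_0, b_0) \omega(a_0) \omega(b_0) \big) \to \infty \quad \text{as } |K| \uparrow \infty, \qquad (3.21)$$

where the sign of $K$ is chosen such that the expression on the right side of (3.21) remains positive.

Suppose that $\varpi = C\omega \otimes \omega$. Then, by the variational characterization of the relative entropy, we have
$\hat I_2(\omega, \varpi) = H(\omega \,\|\, \mu)$, which ends the proof of the upper bound. ■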



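3.8 Lower bound in Theorem 3.3(ii).

Lemma 3.9. For every open set $O \subset \mathcal{M}(\mathcal{X}) \times \tilde{\mathcal{M}}_*(\mathcal{X} \times \mathcal{X})$,

$$\liminf_{n \to \infty} \frac{1}{n} \log \mathbb{P}\big\{ (L^1, L^2) \in O \big\} \ge -\inf_{(\omega, \varpi) \in O} I_2(\omega, \varpi). \qquad (3.22)$$

Proof. Suppose $(\omega, \varpi) \in O$ is such that $\varpi = C\omega \otimes \omega$. Set $\tilde g(a,b) = 0$ for all $a, b \in \mathcal{X}$ and
define $\tilde f_\omega : \mathcal{X} \to \mathbb{R}$ by

$$\tilde f_\omega(a) = \begin{cases} \log \frac{\omega(a)}{\mu(a)} & \text{if } \omega(a) > 0, \\ 0 & \text{otherwise.} \end{cases}$$

We note that this choice of $\tilde g$ yields $\tilde h_n^{(2)}(a,b) = 0$ for all $a, b \in \mathcal{X}$. Choose $B^1_\omega$, $B^2_\varpi$ open neighbourhoods
of $\omega$, $\varpi$ such that $B^1_\omega \times B^2_\varpi \subset O$ and, for all $(\tilde\omega, \tilde\varpi) \in B^1_\omega \times B^2_\varpi$,

$$\langle \tilde f_\omega, \omega \rangle - \varepsilon \le \langle \tilde f_\omega, \tilde\omega \rangle.$$

We use the probability measure $\tilde{\mathbb{P}}$ given by $\tilde f_\omega$ and $\tilde g$. We observe that the colour law is $\omega$ and the connection
probabilities satisfy $a_n^{-1} \tilde p_n(a,b) \to C(a,b) := \varpi(a,b)/(\omega(a) \omega(b))$ as $n$ approaches infinity. Therefore,
using (3.12) we have that

$$\mathbb{P}\big\{ (L^1, L^2) \in O \big\} \ge \tilde{\mathbb{E}}\Big\{ \frac{d\mathbb{P}}{d\tilde{\mathbb{P}}}(X)\, 1_{\{(L^1, L^2) \in B^1_\omega \times B^2_\varpi\}} \Big\} \ge e^{-n \langle \tilde f_\omega, \omega \rangle - n\varepsilon} \times \tilde{\mathbb{P}}\big\{ (L^1, L^2) \in B^1_\omega \times B^2_\varpi \big\}.$$

Therefore, since $\langle \tilde f_\omega, \omega \rangle = H(\omega \,\|\, \mu)$, we have

$$\liminf_{n \to \infty} \frac{1}{n} \log \mathbb{P}\big\{ (L^1, L^2) \in O \big\} \ge -H(\omega \,\|\, \mu) - \varepsilon + \liminf_{n \to \infty} \frac{1}{n} \log \tilde{\mathbb{P}}\big\{ (L^1, L^2) \in B^1_\omega \times B^2_\varpi \big\}.$$

The result follows once we prove that

$$\liminf_{n \to \infty} \frac{1}{n} \log \tilde{\mathbb{P}}\big\{ (L^1, L^2) \in B^1_\omega \times B^2_\varpi \big\} = 0. \qquad (3.23)$$

We use the upper bound (but now with the law $\mathbb{P}$ replaced by $\tilde{\mathbb{P}}$) to prove (3.23).
Therefore, we have

$$\limsup_{n \to \infty} \frac{1}{n} \log \tilde{\mathbb{P}}\big\{ (L^1, L^2) \in (B^1_\omega \times B^2_\varpi)^c \big\} \le -\inf_{(\tilde\omega, \tilde\varpi) \in \tilde F} \tilde I_2(\tilde\omega, \tilde\varpi),$$

$$\tilde I_2(\tilde\omega, \tilde\varpi) = \begin{cases} H(\tilde\omega \,\|\, \omega) & \text{if } \tilde\varpi = C\tilde\omega \otimes \tilde\omega, \\ \infty & \text{otherwise,} \end{cases}$$

where $\tilde F = (B^1_\omega \times B^2_\varpi)^c$. It therefore suffices to show that the infimum is positive. Suppose for
contradiction that there exists a sequence $(\tilde\omega_n, \tilde\varpi_n) \in \tilde F$ with $\tilde I_2(\tilde\omega_n, \tilde\varpi_n) \downarrow 0$. Then, since $\tilde I_2$ is a
good rate function and its level sets are compact, and the mapping $(\tilde\omega, \tilde\varpi) \mapsto \tilde I_2(\tilde\omega, \tilde\varpi)$ is lower
semicontinuous, we can construct a limit point $(\tilde\omega, \tilde\varpi) \in \tilde F$ with $\tilde I_2(\tilde\omega, \tilde\varpi) = 0$. By Lemma 3.8 this
implies $H(\tilde\omega \,\|\, \omega) = 0$ and $\tilde\varpi = C\tilde\omega \otimes \tilde\omega$, hence $\tilde\omega = \omega$ and $\tilde\varpi = C\omega \otimes \omega = \varpi$. This contradicts
$(\tilde\omega, \tilde\varpi) \in \tilde F$. ■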



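3.9 Upper bound in Theorem 3.3(i). We obtain the upper bound in a variational formulation.
We define for $(\omega, \varpi) \in \mathcal{M}(\mathcal{X}) \times \tilde{\mathcal{M}}_*(\mathcal{X} \times \mathcal{X})$ the rate function $\hat I_1$ by

$$\hat I_1(\omega, \varpi) = \sup_{\hat g \in \mathcal{C}_2} \Big\{ \sum_{a,b \in \mathcal{X}} \tfrac12 \hat g(a,b) \varpi(a,b) + \sum_{a,b \in \mathcal{X}} \tfrac12 \big( 1 - e^{\hat g(a,b)} \big) C(a,b) \omega(a) \omega(b) \Big\}. \qquad (3.24)$$

Lemma 3.10. For each closed set $F \subset \mathcal{M}(\mathcal{X}) \times \tilde{\mathcal{M}}_*(\mathcal{X} \times \mathcal{X})$, we have

$$\limsup_{n \to \infty} \frac{1}{a_n n^2} \log \mathbb{P}\big\{ (L^1, L^2) \in F \big\} \le -\inf_{(\omega, \varpi) \in F} \hat I_1(\omega, \varpi).$$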

Proof. For any 3 G C2 we define /3 : x Af — ;> M by 

/3(o,6) = (l-e^('*'^))C7(a,6). 
Prom Lemma [33] we note that lim hl^\a, b) = P{a, b), for all a,b & X. We observe that, for any 6 > 

n— ^00 

and for (sufficiently) large n we have 

h^n^ (a, fc) < |/3(a, 6) | + 5, for all a,b£X. (3.25) 

We take /(a) = 0, for all a € ^, and use p.lSp and (13.25j) to obtain 

for any 5 > and for large n. Therefore, we have that 

limsup^logE|e""'''<5^''^>+'^"'^'<5^'®^''^"H < 0. (3.26) 

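Fix $\varepsilon>0$ and take $I_1^\varepsilon(\omega,\varpi) = \min\{I_1(\omega,\varpi),\varepsilon^{-1}\}-\varepsilon$. Suppose $(\omega,\varpi)\in F$ and choose $g\in\mathcal{C}_2$ such that

$$\tfrac12\langle\varpi,g\rangle + \tfrac12\langle\omega\otimes\omega,\beta\rangle \ge I_1^\varepsilon(\omega,\varpi).$$

Using the finiteness of $\mathcal{X}$ we can find open neighbourhoods $B^2_\varpi$, $B^1_\omega$ of $\varpi$, $\omega$ such that

$$\inf_{\hat\omega\in B^1_\omega,\ \hat\varpi\in B^2_\varpi}\Big\{\tfrac12\langle\hat\varpi,g\rangle + \tfrac12\langle\hat\omega\otimes\hat\omega,\beta\rangle\Big\} \ge I_1^\varepsilon(\omega,\varpi)-\varepsilon.$$

By Chebyshev's inequality and (3.26), we have that

$$\limsup_{n\to\infty}\frac{1}{a_n n^2}\log\mathbb{P}\{(L^1,L^2)\in B^1_\omega\times B^2_\varpi\} \le \limsup_{n\to\infty}\frac{1}{a_n n^2}\log\mathbb{E}\Big\{e^{a_n n^2\langle\frac12 L^2,\,g\rangle + a_n n^2\langle\frac12 L^1\otimes L^1,\,h_n^{(g)}\rangle}\Big\} - I_1^\varepsilon(\omega,\varpi)+\varepsilon \le -I_1^\varepsilon(\omega,\varpi)+\varepsilon. \tag{3.27}$$

We use Lemma 3.6 with $\alpha=\varepsilon^{-1}$ to choose $N(\varepsilon)\in\mathbb{N}$ such that

$$\limsup_{n\to\infty}\frac{1}{a_n n^2}\log\mathbb{P}\{|E| > a_n n^2 N(\varepsilon)\} \le -\varepsilon^{-1}.$$

Define for this $N(\varepsilon)$ the set $K_{N(\varepsilon)}$ by

$$K_{N(\varepsilon)} = \{(\omega,\varpi)\in\mathcal{M}(\mathcal{X})\times\tilde{\mathcal{M}}_*(\mathcal{X}\times\mathcal{X}) : \|\varpi\|\le 2N(\varepsilon)\}.$$

Now, observe that $K_{N(\varepsilon)}\cap F$ is compact and therefore may be covered by finitely many sets $B^1_{\omega_r}\times B^2_{\varpi_r}$, $r=1,\dots,m$, with $(\omega_r,\varpi_r)\in F$ for $r=1,\dots,m$. Hence, we have

$$\mathbb{P}\{(L^1,L^2)\in F\} \le \sum_{r=1}^{m}\mathbb{P}\{(L^1,L^2)\in B^1_{\omega_r}\times B^2_{\varpi_r}\} + \mathbb{P}\{(L^1,L^2)\notin K_{N(\varepsilon)}\}.$$

Using (3.27) for small enough $\varepsilon>0$, we obtain

$$\limsup_{n\to\infty}\frac{1}{a_n n^2}\log\mathbb{P}\{(L^1,L^2)\in F\} \le \max_{r=1}^{m}\limsup_{n\to\infty}\frac{1}{a_n n^2}\log\mathbb{P}\{(L^1,L^2)\in B^1_{\omega_r}\times B^2_{\varpi_r}\}\vee(-\varepsilon^{-1}) \le -\inf_{(\omega,\varpi)\in F} I_1^\varepsilon(\omega,\varpi)+\varepsilon.$$

Taking $\varepsilon\downarrow 0$ we have the desired statement. ■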
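The covering step above relies on the fact that, on an exponential scale, a finite sum of probabilities decays at the rate of its slowest-decaying term. The following minimal Python sketch illustrates this numerically. The decay rates are hypothetical numbers chosen for illustration and are not taken from the paper.

```python
import math

def scaled_log_sum(rates, n):
    """Return (1/n) * log(sum_r exp(-n * rates[r])), computed stably."""
    smallest = min(rates)
    # Factor out the dominant term exp(-n * smallest) to avoid underflow.
    tail = sum(math.exp(-n * (r - smallest)) for r in rates)
    return -smallest + math.log(tail) / n

# Hypothetical decay rates: three covering sets plus the complement of K.
rates = [0.7, 0.3, 1.2, 5.0]

for n in (10, 100, 1000, 10000):
    # The printed values approach -min(rates) as n grows.
    print(n, scaled_log_sum(rates, n))
```

The correction term is at most log(m)/n for m summands, which is why the number of covering sets does not affect the limit.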

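We identify the rate function by solving the variational problem on the right-hand side of equation (3.24).

Lemma 3.11. For any $(\omega,\varpi)\in\mathcal{M}(\mathcal{X})\times\tilde{\mathcal{M}}_*(\mathcal{X}\times\mathcal{X})$ we have $I_1(\omega,\varpi) = \tfrac12\mathfrak{H}_C(\varpi\,\|\,\omega)$.

Proof. (i) Suppose $\varpi\not\ll C\omega\otimes\omega$. Then there exist $a_0,b_0\in\mathcal{X}$ with $C(a_0,b_0)\omega(a_0)\omega(b_0)=0$ and $\varpi(a_0,b_0)>0$. For this $(a_0,b_0)$ we define the symmetric function $g$ by

$$g(a,b) = \log\big(K\big(\mathbb{1}_{(a_0,b_0)}(a,b)+\mathbb{1}_{(b_0,a_0)}(a,b)\big)+1\big),$$

for $a,b\in\mathcal{X}$ and $K>0$. Considering our $g$ in (3.24) we have

$$\tfrac12\sum_{a,b\in\mathcal{X}} g(a,b)\varpi(a,b) + \tfrac12\sum_{a,b\in\mathcal{X}}\big(1-e^{g(a,b)}\big)C(a,b)\omega(a)\omega(b) = \log(K+1)\,\varpi(a_0,b_0)\to\infty, \quad\text{as } K\uparrow\infty.$$

Suppose that $\varpi\ll C\omega\otimes\omega$. Then, we have

$$I_1(\omega,\varpi) \ge \tfrac12\sup_{g\in\mathcal{C}_2}\Big\{\sum_{a,b\in\mathcal{X}} g(a,b)\varpi(a,b) - \sum_{a,b\in\mathcal{X}} e^{g(a,b)}C(a,b)\omega(a)\omega(b)\Big\} + \tfrac12\sum_{a,b\in\mathcal{X}} C(a,b)\omega(a)\omega(b). \tag{3.28}$$

Using the substitution $h = e^{g}\,\frac{C\omega\otimes\omega}{\varpi}$ and $\sup_{x>0}\{\log x - x\} = -1$ we obtain the expression

$$\begin{aligned}
\sup_{g\in\mathcal{C}_2}\Big\{\sum_{a,b\in\mathcal{X}} g(a,b)\varpi(a,b) - \sum_{a,b\in\mathcal{X}} e^{g(a,b)}C(a,b)\omega(a)\omega(b)\Big\}
&= \sup_{h\ge 0}\sum_{a,b\in\mathcal{X}}\Big[\log\Big(h(a,b)\frac{\varpi(a,b)}{C(a,b)\omega(a)\omega(b)}\Big) - h(a,b)\Big]\varpi(a,b)\\
&= \sup_{h\ge 0}\sum_{a,b\in\mathcal{X}}\big(\log h(a,b)-h(a,b)\big)\varpi(a,b) + \sum_{a,b\in\mathcal{X}}\log\Big(\frac{\varpi(a,b)}{C(a,b)\omega(a)\omega(b)}\Big)\varpi(a,b)\\
&= -\|\varpi\| + H(\varpi\,\|\,C\omega\otimes\omega).
\end{aligned} \tag{3.29}$$

This gives $I_1(\omega,\varpi) = \tfrac12\mathfrak{H}_C(\varpi\,\|\,\omega)$, which concludes the proof of the lemma. ■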
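As a numerical sanity check of the identification above, the following Python sketch evaluates the objective in (3.24) at the candidate maximiser $g^* = \log(\varpi/(C\omega\otimes\omega))$ and compares it with the closed form $\tfrac12\big(H(\varpi\,\|\,C\omega\otimes\omega)+\|C\omega\otimes\omega\|-\|\varpi\|\big)$. The two-colour measures below are hypothetical, and the reading of $\mathfrak{H}_C$ as this sum of three terms is an assumption consistent with (3.28) and (3.29).

```python
import math

def objective(g, w, nu):
    """(1/2) sum g*w + (1/2) sum (1 - e^g) * nu over all colour pairs."""
    return 0.5 * sum(g[k] * w[k] + (1.0 - math.exp(g[k])) * nu[k] for k in nu)

def closed_form(w, nu):
    """(1/2) * (H(w || nu) + ||nu|| - ||w||), assuming w << nu."""
    h = sum(w[k] * math.log(w[k] / nu[k]) for k in nu if w[k] > 0)
    return 0.5 * (h + sum(nu.values()) - sum(w.values()))

# Hypothetical two-colour example (not from the paper).
omega = {"r": 0.4, "b": 0.6}
C = {("r", "r"): 1.0, ("r", "b"): 2.0, ("b", "r"): 2.0, ("b", "b"): 0.5}
nu = {k: C[k] * omega[k[0]] * omega[k[1]] for k in C}  # C * omega (x) omega
w = {("r", "r"): 0.3, ("r", "b"): 0.2, ("b", "r"): 0.2, ("b", "b"): 0.1}

# Candidate maximiser obtained from the substitution h = 1.
g_star = {k: math.log(w[k] / nu[k]) for k in nu}
print(objective(g_star, w, nu), closed_form(w, nu))
```

Because the objective is concave in $g$, any perturbation of `g_star` can only decrease its value, which is what the supremum in (3.24) requires.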

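Remark 4. It is not hard to see that $\mathfrak{H}_C(\cdot\,\|\,\cdot)$ is a good rate function, as for all $\alpha<\infty$ its level sets are the bounded, closed sets $\{(\omega,\varpi)\in\mathcal{M}(\mathcal{X})\times\tilde{\mathcal{M}}_*(\mathcal{X}\times\mathcal{X}) : \mathfrak{H}_C(\varpi\,\|\,\omega)\le\alpha\}$ and are therefore compact.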

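3.10 Lower bound in Theorem 3.3(i). We use the LDP on the scale $n$ (but with the law $\mathbb{P}$ replaced by $\tilde{\mathbb{P}}$) to establish the lower bound for some open set $O\subset\mathcal{M}(\mathcal{X})\times\tilde{\mathcal{M}}_*(\mathcal{X}\times\mathcal{X})$.

Lemma 3.12. For every open set $O\subset\mathcal{M}(\mathcal{X})\times\tilde{\mathcal{M}}_*(\mathcal{X}\times\mathcal{X})$,

$$\liminf_{n\to\infty}\frac{1}{a_n n^2}\log\mathbb{P}\{(L^1,L^2)\in O\} \ge -\inf_{(\omega,\varpi)\in O} I_1(\omega,\varpi). \tag{3.30}$$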

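Proof. Suppose $(\omega,\varpi)\in O$ with $\varpi\ll C\omega\otimes\omega$. We define the function $\tilde f_\omega:\mathcal{X}\to\mathbb{R}$ by

$$\tilde f_\omega(a) = \begin{cases}\log\dfrac{\omega(a)}{\mu(a)} & \text{if } \omega(a)>0,\\ 0 & \text{otherwise,}\end{cases}$$

and the symmetric function $\tilde g_\varpi:\mathcal{X}\times\mathcal{X}\to\mathbb{R}$ by

$$\tilde g_\varpi(a,b) = \begin{cases}\log\dfrac{\varpi(a,b)}{C(a,b)\omega(a)\omega(b)} & \text{if } \varpi(a,b)>0,\\ 0 & \text{otherwise.}\end{cases}$$

We recall that

$$h_n^{(\tilde g_\varpi)}(a,b) = -\log\Big[1-p_n(a,b)+p_n(a,b)e^{\tilde g_\varpi(a,b)}\Big]^{1/a_n}, \quad\text{for } a,b\in\mathcal{X}.$$

Define the symmetric function $\beta_\varpi(a,b)$ by

$$\beta_\varpi(a,b) := \lim_{n\to\infty} h_n^{(\tilde g_\varpi)}(a,b) = C(a,b)\big(1-e^{\tilde g_\varpi(a,b)}\big).$$

Choose $B^1_\omega$, $B^2_\varpi$ open neighbourhoods of $\omega$, $\varpi$ such that $B^1_\omega\times B^2_\varpi\subset O$ and for all $(\hat\omega,\hat\varpi)\in B^1_\omega\times B^2_\varpi$,

$$\langle\hat\varpi,\tilde g_\varpi\rangle + \langle\hat\omega\otimes\hat\omega,\beta_\varpi\rangle - \varepsilon \le \langle\varpi,\tilde g_\varpi\rangle + \langle\omega\otimes\omega,\beta_\varpi\rangle.$$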

We note that the coloured random graph obtained from the functions $\tilde f_\omega$, $\tilde g_\varpi$ has colour law $\omega$ and connection probabilities $\tilde p_n(a,b)\in[0,1]$ satisfying

$$a_n^{-1}\tilde p_n(a,b)\to\tilde C(a,b) := \varpi(a,b)/(\omega(a)\omega(b)), \quad\text{as } n\to\infty.$$
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Write m := A m.mP^{a, a), and / := A max/(a). Now, using (|3.15p we have that 

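Therefore, we have

$$\liminf_{n\to\infty}\frac{1}{a_n n^2}\log\mathbb{P}\{(L^1,L^2)\in O\} \ge -\tfrac12\langle\tilde g_\varpi,\varpi\rangle - \tfrac12\langle\beta_\varpi,\omega\otimes\omega\rangle - \varepsilon + \liminf_{n\to\infty}\frac{1}{a_n n^2}\log\tilde{\mathbb{P}}\{(L^1,L^2)\in B^1_\omega\times B^2_\varpi\}.$$

The result follows once we prove that

$$\liminf_{n\to\infty}\frac{1}{a_n n^2}\log\tilde{\mathbb{P}}\{(L^1,L^2)\in B^1_\omega\times B^2_\varpi\} = 0. \tag{3.31}$$

To conclude the proof, we use the lower bound of the LDP on the scale $n$ (but with the law $\mathbb{P}$ replaced by $\tilde{\mathbb{P}}$) to prove (3.31). We notice from Theorem 3.3(ii) that, for any $\delta>0$ and for large $n$, we have

$$\tilde{\mathbb{P}}\{(L^1,L^2)\in B^1_\omega\times B^2_\varpi\} \ge e^{-n\alpha_2(\omega,\varpi)-n\delta},$$

where $\alpha_2(\omega,\varpi) = \inf\{\tilde I_2(\hat\omega,\hat\varpi) : (\hat\omega,\hat\varpi)\in B^1_\omega\times B^2_\varpi\}$ and

$$\tilde I_2(\hat\omega,\hat\varpi) = \begin{cases} H(\hat\omega\,\|\,\omega) & \text{if } \hat\varpi=\tilde C\hat\omega\otimes\hat\omega,\\ \infty & \text{otherwise.}\end{cases}$$

Therefore, we have

$$\liminf_{n\to\infty}\frac{1}{a_n n^2}\log\tilde{\mathbb{P}}\{(L^1,L^2)\in B^1_\omega\times B^2_\varpi\} \ge \liminf_{n\to\infty}\frac{1}{a_n n}\{-\alpha_2(\omega,\varpi)-\delta\} = 0,$$

since $a_n n\to\infty$ as $n\to\infty$. This concludes the proof of the lemma.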



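3.11 Upper bound in Theorem 3.4(i). To begin we obtain the upper bound in a variational formulation. We recall that $n a_n\to 0$ for subcritical coloured graphs and write

$$Z_n(f) := \frac{1}{n a_n} U_{n a_n f}.$$

Notice $Z(f) := \lim_{n\to\infty} Z_n(f) < \infty$. Define for $(\omega,\varpi)\in\mathcal{M}(\mathcal{X})\times\tilde{\mathcal{M}}_*(\mathcal{X}\times\mathcal{X})$ the rate function $\hat I_3$ by

$$\hat I_3(\omega,\varpi) = \sup_{f\in\mathcal{C}_1,\ g\in\mathcal{C}_2}\Big\{\sum_{a\in\mathcal{X}}\big(f(a)-Z(f)\big)\omega(a) + \tfrac12\sum_{a,b\in\mathcal{X}} g(a,b)\varpi(a,b) + \tfrac12\sum_{a,b\in\mathcal{X}}\big(1-e^{g(a,b)}\big)C(a,b)\omega(a)\omega(b)\Big\}. \tag{3.32}$$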


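Lemma 3.13. For each closed set $F\subset\mathcal{M}(\mathcal{X})\times\tilde{\mathcal{M}}_*(\mathcal{X}\times\mathcal{X})$,

$$\limsup_{n\to\infty}\frac{1}{a_n n^2}\log\mathbb{P}\{(L^1,L^2)\in F\} \le -\inf_{(\omega,\varpi)\in F}\hat I_3(\omega,\varpi).$$

Proof. Fix $f\in\mathcal{C}_1$. For any $g\in\mathcal{C}_2$ we define $\beta:\mathcal{X}\times\mathcal{X}\to\mathbb{R}$ by $\beta(a,b) = (1-e^{g(a,b)})C(a,b)$. Lemma 3.5 gives $\lim_{n\to\infty} h_n^{(g)}(a,b) = \beta(a,b)$, for all $a,b\in\mathcal{X}$. We note that, for any $\delta>0$ and for large $n$, we have

$$h_n^{(g)}(a,b) \le |\beta(a,b)|+\delta, \quad\text{for all } a,b\in\mathcal{X}. \tag{3.33}$$

Taking $\tilde f(a) = n a_n f(a)$, for all $a\in\mathcal{X}$, and using (3.15) and (3.33) we have

for any $\delta>0$ and for large $n$. Therefore, we have

$$\limsup_{n\to\infty}\frac{1}{a_n n^2}\log\mathbb{E}\Big\{e^{a_n n^2\langle L^1,\,f-Z_n(f)\rangle + a_n n^2\langle\frac12 L^2,\,g\rangle + a_n n^2\langle\frac12 L^1\otimes L^1,\,h_n^{(g)}\rangle}\Big\} \le 0. \tag{3.34}$$

Fix $\varepsilon>0$ and take $\hat I_3^\varepsilon(\omega,\varpi) = \min\{\hat I_3(\omega,\varpi),\varepsilon^{-1}\}-\varepsilon$. Suppose $(\omega,\varpi)\in F$ and choose $f\in\mathcal{C}_1$, $g\in\mathcal{C}_2$ such that

$$\langle\omega,f-Z(f)\rangle + \tfrac12\langle\varpi,g\rangle + \tfrac12\langle\omega\otimes\omega,\beta\rangle \ge \hat I_3^\varepsilon(\omega,\varpi).$$

By finiteness of $\mathcal{X}$, we can find open neighbourhoods $B^2_\varpi$, $B^1_\omega$ of $\varpi$, $\omega$ such that

$$\inf_{\hat\omega\in B^1_\omega,\ \hat\varpi\in B^2_\varpi}\Big\{\langle\hat\omega,f-Z(f)\rangle + \tfrac12\langle\hat\varpi,g\rangle + \tfrac12\langle\hat\omega\otimes\hat\omega,\beta\rangle\Big\} \ge \hat I_3^\varepsilon(\omega,\varpi)-\varepsilon.$$

By Chebyshev's inequality and (3.34), we have that

$$\limsup_{n\to\infty}\frac{1}{a_n n^2}\log\mathbb{P}\{(L^1,L^2)\in B^1_\omega\times B^2_\varpi\} \le \limsup_{n\to\infty}\frac{1}{a_n n^2}\log\mathbb{E}\Big\{e^{a_n n^2\langle L^1,\,f-Z_n(f)\rangle + a_n n^2\langle\frac12 L^2,\,g\rangle + a_n n^2\langle\frac12 L^1\otimes L^1,\,h_n^{(g)}\rangle}\Big\} - \hat I_3^\varepsilon(\omega,\varpi)+\varepsilon \le -\hat I_3^\varepsilon(\omega,\varpi)+\varepsilon. \tag{3.35}$$

By Lemma 3.6 we choose $N(\varepsilon)\in\mathbb{N}$ (with $\alpha=\varepsilon^{-1}$) such that

$$\limsup_{n\to\infty}\frac{1}{a_n n^2}\log\mathbb{P}\{|E| > a_n n^2 N(\varepsilon)\} \le -\varepsilon^{-1}.$$

Define for this $N(\varepsilon)$ the set $K_{N(\varepsilon)}$ by

$$K_{N(\varepsilon)} = \{(\omega,\varpi)\in\mathcal{M}(\mathcal{X})\times\tilde{\mathcal{M}}_*(\mathcal{X}\times\mathcal{X}) : \|\varpi\|\le 2N(\varepsilon)\}.$$

Note $K_{N(\varepsilon)}\cap F$ is compact and therefore may be covered by finitely many sets $B^1_{\omega_r}\times B^2_{\varpi_r}$, $r=1,\dots,m$, with $(\omega_r,\varpi_r)\in F$ for $r=1,\dots,m$. Hence, we have

$$\mathbb{P}\{(L^1,L^2)\in F\} \le \sum_{r=1}^{m}\mathbb{P}\{(L^1,L^2)\in B^1_{\omega_r}\times B^2_{\varpi_r}\} + \mathbb{P}\{(L^1,L^2)\notin K_{N(\varepsilon)}\}.$$

Using (3.35) for small enough $\varepsilon>0$, we obtain

$$\limsup_{n\to\infty}\frac{1}{a_n n^2}\log\mathbb{P}\{(L^1,L^2)\in F\} \le \max_{r=1}^{m}\limsup_{n\to\infty}\frac{1}{a_n n^2}\log\mathbb{P}\{(L^1,L^2)\in B^1_{\omega_r}\times B^2_{\varpi_r}\}\vee(-\varepsilon^{-1}) \le -\inf_{(\omega,\varpi)\in F}\hat I_3^\varepsilon(\omega,\varpi)+\varepsilon.$$

Taking $\varepsilon\downarrow 0$ we have the required statement.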



We identify the rate function by solving the variational problem on the right-hand side of equation (3.32).

Lemma 3.14. For any $(\omega,\varpi)\in\mathcal{M}(\mathcal{X})\times\tilde{\mathcal{M}}_*(\mathcal{X}\times\mathcal{X})$, we have $\hat I_3(\omega,\varpi) = I_3(\omega,\varpi)$.

Proof. (i) Suppose $\omega\in\mathcal{M}(\mathcal{X})$ is not equal to $\mu$. Define the function $f$ by

$$f(a) = K\log\big(|\omega(a)-\mu(a)|+1\big), \quad\text{for } a\in\mathcal{X} \text{ and } K\in\mathbb{R}.$$

Set $g(a,b)=0$ for all $a,b\in\mathcal{X}$ in (3.32) and note that by the choice of $f$ we have

^{fia)-Zif)Ma)+ y{a,b)vj{a,b) ^(1 - e^('^'''))C(a, 6V(aV(6) 

> K Y^ log(|a;(a) — /u(a)| + l)uj{a) — max \oj{a) — ^{a)\ — 1^ -' — ^- — > 00, 

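where the sign of $K$ is such that the last expression always stays positive. Suppose $\varpi\not\ll C\omega\otimes\omega$. Then there exist $a_0,b_0\in\mathcal{X}$ with $C(a_0,b_0)\omega(a_0)\omega(b_0)=0$ and $\varpi(a_0,b_0)>0$. For this $(a_0,b_0)$ we define the function $g$ by

$$g(a,b) = \log\big(K\big(\mathbb{1}_{(a_0,b_0)}(a,b)+\mathbb{1}_{(b_0,a_0)}(a,b)\big)+1\big),$$

for $a,b\in\mathcal{X}$ and $K>0$. Considering our $g$ in (3.24) we have

$$\tfrac12\sum_{a,b\in\mathcal{X}} g(a,b)\varpi(a,b) + \tfrac12\sum_{a,b\in\mathcal{X}}\big(1-e^{g(a,b)}\big)C(a,b)\omega(a)\omega(b) = \log(K+1)\,\varpi(a_0,b_0)\to\infty, \quad\text{as } K\uparrow\infty.$$

Suppose that $\varpi\ll C\omega\otimes\omega$. Then, we have

$$\hat I_3(\omega,\varpi) \ge \tfrac12\sup_{g\in\mathcal{C}_2}\Big\{\sum_{a,b\in\mathcal{X}} g(a,b)\varpi(a,b) - \sum_{a,b\in\mathcal{X}} e^{g(a,b)}C(a,b)\omega(a)\omega(b)\Big\} + \tfrac12\sum_{a,b\in\mathcal{X}} C(a,b)\omega(a)\omega(b).$$

By the substitution $h = e^{g}\,\frac{C\omega\otimes\omega}{\varpi}$ and $\sup_{x>0}\{\log x - x\} = -1$ we have (3.29), which yields

$$\hat I_3(\omega,\varpi) = I_3(\omega,\varpi).$$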



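3.12 Lower bound in Theorem 3.4(i).

Lemma 3.15. For every open set $O\subset\mathcal{M}(\mathcal{X})\times\tilde{\mathcal{M}}_*(\mathcal{X}\times\mathcal{X})$,

$$\liminf_{n\to\infty}\frac{1}{a_n n^2}\log\mathbb{P}\{(L^1,L^2)\in O\} \ge -\inf_{(\omega,\varpi)\in O} I_3(\omega,\varpi). \tag{3.36}$$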



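Proof. Suppose $(\omega,\varpi)\in O$ with $\varpi\ll C\omega\otimes\omega$ and $\omega=\mu$. Take $\tilde f(a)=0$, for all $a\in\mathcal{X}$. Define the symmetric function $\tilde g_\varpi:\mathcal{X}\times\mathcal{X}\to\mathbb{R}$ by

$$\tilde g_\varpi(a,b) = \begin{cases}\log\dfrac{\varpi(a,b)}{C(a,b)\mu(a)\mu(b)} & \text{if } \varpi(a,b)>0,\\ 0 & \text{otherwise.}\end{cases}$$

Recall that

$$h_n^{(\tilde g_\varpi)}(a,b) = -\log\Big[1-p_n(a,b)+p_n(a,b)e^{\tilde g_\varpi(a,b)}\Big]^{1/a_n}, \quad\text{for } a,b\in\mathcal{X}.$$

Define the function $\beta_\varpi(a,b)$ by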

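$$\beta_\varpi(a,b) := \lim_{n\to\infty} h_n^{(\tilde g_\varpi)}(a,b) = C(a,b)\big(1-e^{\tilde g_\varpi(a,b)}\big).$$

Choose $B^1_\omega$, $B^2_\varpi$ open neighbourhoods of $\omega$, $\varpi$ such that $B^1_\omega\times B^2_\varpi\subset O$ and for all $(\hat\omega,\hat\varpi)\in B^1_\omega\times B^2_\varpi$,

$$\langle\hat\varpi,\tilde g_\varpi\rangle + \langle\hat\omega\otimes\hat\omega,\beta_\varpi\rangle - \varepsilon \le \langle\varpi,\tilde g_\varpi\rangle + \langle\omega\otimes\omega,\beta_\varpi\rangle.$$

We note that the random coloured graph obtained from the function $\tilde g_\varpi$ has colour law $\omega$ and connection probabilities satisfying

$$a_n^{-1}\tilde p_n(a,b)\to\tilde C(a,b) := \varpi(a,b)/(\omega(a)\omega(b)), \quad\text{as } n\to\infty.$$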

Using (I3.12P we have 

f{{L\ L') G 0} > E{f (X)]l{(^.,^.),s. J 

> g-a„n2(^ ro,gtn)-a„n2(^aj(g)a;,/3)+a„m/4-^ OnJi^g ^ ]p|(|^l g ^1 ^ ^2 | 

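where $m := \tfrac12\min_{a\in\mathcal{X}}\beta_\varpi(a,a)$. Therefore, we have

$$\liminf_{n\to\infty}\frac{1}{a_n n^2}\log\mathbb{P}\{(L^1,L^2)\in O\} \ge -\tfrac12\langle\varpi,\tilde g_\varpi\rangle - \tfrac12\langle\omega\otimes\omega,\beta_\varpi\rangle - \varepsilon + \liminf_{n\to\infty}\frac{1}{a_n n^2}\log\tilde{\mathbb{P}}\{(L^1,L^2)\in B^1_\omega\times B^2_\varpi\}.$$

The result follows once we prove that

$$\liminf_{n\to\infty}\frac{1}{a_n n^2}\log\tilde{\mathbb{P}}\{(L^1,L^2)\in B^1_\omega\times B^2_\varpi\} = 0. \tag{3.37}$$

We use the upper bound (but now with the law $\mathbb{P}$ replaced by $\tilde{\mathbb{P}}$) to prove (3.37).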



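Therefore, we have

$$\limsup_{n\to\infty}\frac{1}{a_n n^2}\log\tilde{\mathbb{P}}\{(L^1,L^2)\in(B^1_\omega\times B^2_\varpi)^c\} \le -\inf_{(\hat\omega,\hat\varpi)\in\tilde F}\tilde I_3(\hat\omega,\hat\varpi),$$

$$\tilde I_3(\hat\omega,\hat\varpi) = \begin{cases}\tfrac12\mathfrak{H}_{\tilde C}(\hat\varpi\,\|\,\hat\omega) & \text{if } \hat\omega=\omega,\\ \infty & \text{otherwise,}\end{cases}$$

where $\tilde F = (B^1_\omega\times B^2_\varpi)^c$ is the complement of the set $B^1_\omega\times B^2_\varpi$. It remains for us to show that the infimum is positive. Suppose for contradiction that there exists a sequence $(\omega_n,\varpi_n)\in\tilde F$ such that $\tilde I_3(\omega_n,\varpi_n)\downarrow 0$. Then, because $\tilde I_3$ is a good rate function with all its level sets compact, and by lower semicontinuity of the mapping $(\hat\omega,\hat\varpi)\mapsto\tilde I_3(\hat\omega,\hat\varpi)$, we can construct a limit point $(\tilde\omega,\tilde\varpi)\in\tilde F$ with $\tilde I_3(\tilde\omega,\tilde\varpi)=0$. This means $\tilde\omega=\omega$ and $\tilde\varpi=\tilde C\omega\otimes\omega=\varpi$, contradicting $(\tilde\omega,\tilde\varpi)\in\tilde F$. ■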

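3.13 Upper bound in Theorem 3.4(ii). We define for $(\omega,\varpi)\in\mathcal{M}(\mathcal{X})\times\tilde{\mathcal{M}}_*(\mathcal{X}\times\mathcal{X})$ the function $I_4(\omega,\varpi)$ by

$$I_4(\omega,\varpi) = \sup_{f\in\mathcal{C}_1}\Big\{\sum_{a\in\mathcal{X}}\big(f(a)-U_f\big)\omega(a)\Big\}. \tag{3.38}$$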

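Lemma 3.16. For each closed set $F\subset\mathcal{M}(\mathcal{X})\times\tilde{\mathcal{M}}_*(\mathcal{X}\times\mathcal{X})$, we have

$$\limsup_{n\to\infty}\tfrac{1}{n}\log\mathbb{P}\{(L^1,L^2)\in F\} \le -\inf_{(\omega,\varpi)\in F} I_4(\omega,\varpi).$$

Proof. Fix $f\in\mathcal{C}_1$ and take $g(a,b)=0$, for all $a,b\in\mathcal{X}$. We observe that by this choice of $g$

$$h_n^{(g)}(a,b) = 0, \quad\text{for all } a,b\in\mathcal{X}.$$

Using (3.12) we obtain $\mathbb{E}\{e^{n\langle L^1,\,f-U_f\rangle}\}\le 1$ and therefore, we have

$$\limsup_{n\to\infty}\tfrac{1}{n}\log\mathbb{E}\{e^{n\langle L^1,\,f-U_f\rangle}\} \le 0. \tag{3.39}$$

Now fix $\varepsilon>0$ and write $I_4^\varepsilon(\omega,\varpi) := \min\{I_4(\omega,\varpi),\varepsilon^{-1}\}-\varepsilon$. We suppose $(\omega,\varpi)\in F$ and choose $f\in\mathcal{C}_1$ such that

$$\langle f-U_f,\omega\rangle \ge I_4^\varepsilon(\omega,\varpi).$$

By finiteness of $\mathcal{X}$, we can find open neighbourhoods $B^2_\varpi$ and $B^1_\omega$ of $\varpi$, $\omega$ such that

$$\inf_{\hat\omega\in B^1_\omega,\ \hat\varpi\in B^2_\varpi}\{\langle\hat\omega,f-U_f\rangle\} \ge I_4^\varepsilon(\omega,\varpi)-\varepsilon.$$

Using Chebyshev's inequality and (3.39), we have that

$$\limsup_{n\to\infty}\tfrac{1}{n}\log\mathbb{P}\{(L^1,L^2)\in B^1_\omega\times B^2_\varpi\} \le \limsup_{n\to\infty}\tfrac{1}{n}\log\mathbb{E}\{e^{n\langle L^1,\,f-U_f\rangle}\} - I_4^\varepsilon(\omega,\varpi)+\varepsilon \le -I_4^\varepsilon(\omega,\varpi)+\varepsilon. \tag{3.40}$$

By Lemma 3.6 with $\alpha=\varepsilon^{-1}$, we choose $N(\varepsilon)\in\mathbb{N}$ such that

$$\limsup_{n\to\infty}\tfrac{1}{n}\log\mathbb{P}\{|E| > a_n n^2 N(\varepsilon)\} \le -\varepsilon^{-1}.$$

We define for this $N(\varepsilon)$ the set $K_{N(\varepsilon)}$ by

$$K_{N(\varepsilon)} = \{(\omega,\varpi)\in\mathcal{M}(\mathcal{X})\times\tilde{\mathcal{M}}_*(\mathcal{X}\times\mathcal{X}) : \|\varpi\|\le 2N(\varepsilon)\}.$$

The set $K_{N(\varepsilon)}\cap F$ is compact and therefore may be covered by finitely many sets $B^1_{\omega_r}\times B^2_{\varpi_r}$, $r=1,\dots,m$, with $(\omega_r,\varpi_r)\in F$ for $r=1,\dots,m$. Hence, we have

$$\mathbb{P}\{(L^1,L^2)\in F\} \le \sum_{r=1}^{m}\mathbb{P}\{(L^1,L^2)\in B^1_{\omega_r}\times B^2_{\varpi_r}\} + \mathbb{P}\{(L^1,L^2)\notin K_{N(\varepsilon)}\}.$$

Now we use (3.40) to obtain, for all sufficiently small $\varepsilon>0$,

$$\limsup_{n\to\infty}\tfrac{1}{n}\log\mathbb{P}\{(L^1,L^2)\in F\} \le \max_{r=1}^{m}\limsup_{n\to\infty}\tfrac{1}{n}\log\mathbb{P}\{(L^1,L^2)\in B^1_{\omega_r}\times B^2_{\varpi_r}\}\vee(-\varepsilon^{-1}) \le -\inf_{(\omega,\varpi)\in F} I_4^\varepsilon(\omega,\varpi)+\varepsilon.$$

Taking $\varepsilon\downarrow 0$ we have the desired statement. ■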

By the variational characterization of the relative entropy we have $I_4(\omega,\varpi) = H(\omega\,\|\,\mu)$, which ends the proof of the upper bound.
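The variational characterization used here can be checked numerically on a small colour set. The Python sketch below assumes that $U_f = \log\sum_a e^{f(a)}\mu(a)$, which is the choice that makes $\mathbb{E}\{e^{n\langle L^1, f-U_f\rangle}\}$ equal to one for independent colours; this reading of $U_f$ and the three-colour laws are assumptions for illustration only.

```python
import math

def log_mgf(f, mu):
    """Assumed form of U_f: log of sum_a exp(f(a)) * mu(a)."""
    return math.log(sum(mu[a] * math.exp(f[a]) for a in mu))

def dv_objective(f, omega, mu):
    """<f, omega> - U_f, the quantity maximised in (3.38) for a probability law omega."""
    return sum(omega[a] * f[a] for a in mu) - log_mgf(f, mu)

def relative_entropy(omega, mu):
    """H(omega || mu) for probability vectors with omega << mu."""
    return sum(omega[a] * math.log(omega[a] / mu[a]) for a in mu if omega[a] > 0)

# Hypothetical colour laws (not from the paper).
mu = {"r": 0.5, "g": 0.3, "b": 0.2}
omega = {"r": 0.2, "g": 0.3, "b": 0.5}

# The supremum is attained at f* = log(omega / mu).
f_star = {a: math.log(omega[a] / mu[a]) for a in mu}
print(dv_objective(f_star, omega, mu), relative_entropy(omega, mu))
```

Any other test function gives a value no larger than the relative entropy, so the two printed numbers agree up to rounding.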

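3.14 Lower bound in Theorem 3.4(ii)

Lemma 3.17. For every open set $O\subset\mathcal{M}(\mathcal{X})\times\tilde{\mathcal{M}}_*(\mathcal{X}\times\mathcal{X})$,

$$\liminf_{n\to\infty}\tfrac{1}{n}\log\mathbb{P}\{(L^1,L^2)\in O\} \ge -\inf_{(\omega,\varpi)\in O} I_4(\omega,\varpi). \tag{3.41}$$

Proof. Suppose $(\omega,\varpi)\in O$ with $\varpi\ll C\omega\otimes\omega$. We define the function $\tilde f_\omega:\mathcal{X}\to\mathbb{R}$ by

$$\tilde f_\omega(a) = \begin{cases}\log\dfrac{\omega(a)}{\mu(a)} & \text{if } \omega(a)>0,\\ 0 & \text{otherwise,}\end{cases}$$

and the symmetric function $g_\varpi:\mathcal{X}\times\mathcal{X}\to\mathbb{R}$ by

$$g_\varpi(a,b) = \begin{cases}\log\dfrac{\varpi(a,b)}{C(a,b)\omega(a)\omega(b)} & \text{if } \varpi(a,b)>0,\\ 0 & \text{otherwise.}\end{cases}$$

Set $\tilde g_\varpi(a,b) = n a_n g_\varpi(a,b)$, for all $a,b\in\mathcal{X}$, and note that by the choice of $\tilde g_\varpi$ we have

$$\lim_{n\to\infty} h_n^{(\tilde g_\varpi)}(a,b) = 0, \quad\text{for all } a,b\in\mathcal{X}.$$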

Choose Bl,Bl open neighbourhoods of u, w such that x i?^ C O and for all (tD, ro) G i?^ x B"^, 

{uj, fj) + i ||ti7||e - e < (cD, ^) + i ||tz7||e. 

We use the probability measure $\tilde{\mathbb{P}}$ given by $\tilde g_\varpi$. We observe that the colour law is $\omega$ and the connection probabilities satisfy $a_n^{-1}\tilde p_n(a,b)\to\tilde C(a,b) := \varpi(a,b)/(\omega(a)\omega(b))$, as $n$ approaches infinity. Therefore, using (3.12) we have that

p{(L\l2)g0} >E{f 

where / = A max g.^{a, b). Therefore, we have that 

liminfilogP{(L\L2) GO} > -{u, /^) - i ||tn||e - e + liminf i logP{(L\ L^) G 5^ x 5^}. 

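The result follows once we prove that

$$\liminf_{n\to\infty}\tfrac{1}{n}\log\tilde{\mathbb{P}}\{(L^1,L^2)\in B^1_\omega\times B^2_\varpi\} = 0. \tag{3.42}$$

We use the lower bound of the LDP on the scale $a_n n^2$ (but now with the law $\mathbb{P}$ replaced by $\tilde{\mathbb{P}}$) to prove (3.42). We observe from the lower bound of Theorem 3.4(i) that, for any $\delta>0$ and for large $n$, we have

$$\tilde{\mathbb{P}}\{(L^1,L^2)\in B^1_\omega\times B^2_\varpi\} \ge e^{-a_n n^2(\alpha_3(\omega,\varpi)+\delta)},$$

where $\alpha_3(\omega,\varpi) = \inf\{\tilde I_3(\hat\omega,\hat\varpi) : (\hat\omega,\hat\varpi)\in B^1_\omega\times B^2_\varpi\}$ and

$$\tilde I_3(\hat\omega,\hat\varpi) = \begin{cases}\tfrac12\mathfrak{H}_{\tilde C}(\hat\varpi\,\|\,\hat\omega) & \text{if } \hat\omega=\omega,\\ \infty & \text{otherwise.}\end{cases}$$

Therefore, we have that

$$\liminf_{n\to\infty}\tfrac{1}{n}\log\tilde{\mathbb{P}}\{(L^1,L^2)\in B^1_\omega\times B^2_\varpi\} \ge \liminf_{n\to\infty} a_n n\{-\alpha_3(\omega,\varpi)-\delta\} = 0,$$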

since $a_n n\to 0$ as $n$ approaches infinity. This concludes the proof of the lemma.

3.15 Proof of Theorem 2.1. Recall $N_0$ from the boundedness of $Q$ and write

$$\mathcal{X}_0^* = \bigcup_{n=0}^{N_0}\{n\}\times\mathcal{X}^n.$$

We equip it with the discrete topology. We recall also that $\pi$ is the unique eigenvector (normalized to a probability vector) of the matrix $A$ corresponding to the eigenvalue 1. We recall that $\mathbb{P}_n$ is the law of a multitype Galton-Watson tree conditioned to have $n$ vertices, and derive from Theorem 3.1 the following weak law of large numbers.

Lemma 3.18. Suppose that $X$ is an irreducible, critical multitype Galton-Watson tree with bounded offspring law $Q$, conditioned to have exactly $n$ vertices. Then, for any $\varepsilon>0$,

$$\lim_{n\to\infty}\mathbb{P}_n\Big\{\max_{(a,c)\in\mathcal{X}\times\mathcal{X}_0^*}|M_X(a,c)-\pi\otimes Q(a,c)|\ge\varepsilon\Big\} = 0. \tag{3.43}$$

Proof. Define the closed set

$$F = \Big\{\nu\in\mathcal{M}(\mathcal{X}\times\mathcal{X}_0^*) : \max_{(a,c)\in\mathcal{X}\times\mathcal{X}_0^*}|\nu(a,c)-\pi\otimes Q(a,c)|\ge\varepsilon\Big\}.$$

We observe from Theorem 3.1 that

$$\limsup_{n\to\infty}\tfrac{1}{n}\log\mathbb{P}_n\{M_X\in F\} \le -\inf_{\nu\in F} J(\nu). \tag{3.44}$$

We show by contradiction that the right-hand side of (3.44) is negative.

To do this, we suppose that there exists a sequence $\nu_n$ in $F$ such that $J(\nu_n)\downarrow 0$. Then, because $J$ is a good rate function and its level sets are compact, and by lower semicontinuity of the mapping $\nu\mapsto J(\nu)$, there is a limit $\nu\in F$ with $J(\nu)=0$. Hence, we have that $\nu$ is shift-invariant and $H(\nu\,\|\,\nu_1\otimes Q)=0$. This implies $\nu(a,c)=\nu_1\otimes Q(a,c)$, for every $(a,c)\in\mathcal{X}\times\mathcal{X}_0^*$. Using shift-invariance of $\nu$, for any $b\in\mathcal{X}$, we have

$$\sum_{a\in\mathcal{X}} A(b,a)\nu_1(a) = \sum_{(a,c)\in\mathcal{X}\times\mathcal{X}_0^*} Q\{c\,|\,a\}\,m(b,c)\,\nu_1(a) = \sum_{(a,c)\in\mathcal{X}\times\mathcal{X}_0^*}\nu(a,c)\,m(b,c) = \nu_1(b).$$

This means that $\nu_1$ is a nonnegative eigenvector of $A$ for the eigenvalue 1. By uniqueness of the Perron-Frobenius eigenvector, see, for example, Dembo et al. [10, Theorem 3.1.1(d)], we infer that $\nu_1=\pi$. Hence $\nu=\pi\otimes Q$, which contradicts $\pi\otimes Q\notin F$. ■
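The eigenvector $\pi$ appearing in this argument can be approximated by power iteration when $A$ is primitive. The Python sketch below uses a hypothetical $2\times 2$ mean-offspring matrix whose columns sum to one, so that the eigenvalue 1 is the Perron-Frobenius eigenvalue; the matrix and the function name `perron_vector` are illustrative assumptions, not objects from the paper.

```python
def perron_vector(A, iters=200):
    """Approximate the probability eigenvector pi with sum_a A[b][a] * pi[a] = pi[b]."""
    n = len(A)
    v = [1.0 / n] * n
    for _ in range(iters):
        # One multiplication by A, followed by renormalisation to a probability vector.
        w = [sum(A[b][a] * v[a] for a in range(n)) for b in range(n)]
        total = sum(w)
        v = [x / total for x in w]
    return v

# Hypothetical critical mean matrix: A[b][a] is the mean number of
# type-b children of a type-a parent.
A = [[0.2, 0.6],
     [0.8, 0.4]]

pi = perron_vector(A)
print(pi)  # close to [3/7, 4/7]
```

For this matrix the second eigenvalue is $-0.4$, so the iteration converges geometrically. A periodic irreducible matrix would require a different method, such as solving the linear system directly.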

We recall that $\mathcal{T}$ is the set of all finite rooted planar trees $T$, $V=V(T)$ is the set of all vertices and $|T|$ is the number of vertices in the tree $T$. We now compute the probability weight $P_n(x)$ of $x\in\mathcal{T}$ with $|T|=n$ as

$$P_n(x) = \frac{\mu(x(\rho))}{\mathbb{P}\{|T|=n\}}\prod_{v\in V(T)} Q\{C(v)=c(v)\,|\,X(v)=x(v)\},$$

where $(x(v),c(v))$ is the type and the configuration of children of vertex $v$ of $x\in\mathcal{T}$. Therefore, we have that

$$-\tfrac{1}{n}\log P_n(x) = -\tfrac{1}{n}\log\mu(x(\rho)) + \tfrac{1}{n}\log\mathbb{P}\{|T|=n\} + \langle M_x,-\log Q\rangle.$$

Now the term $-\tfrac{1}{n}\log\mu(x(\rho))$ converges to zero, while the term $\tfrac{1}{n}\log\mathbb{P}\{|T|=n\}$ converges to zero because $Q$ is bounded. See Dembo et al. [SI Lemma 3.1]. We observe that $-\log Q$ is almost surely bounded on the support of $M_X$ and therefore, by Lemma 3.18 we have $\langle M_X,-\log Q\rangle\to\langle\pi\otimes Q,-\log Q\rangle$, which concludes the proof of Theorem 2.1.
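To make the limit $\langle\pi\otimes Q,-\log Q\rangle$ concrete, the following Python sketch evaluates it for a toy two-type offspring law. The law `Q`, the vector `pi` and the function name `tree_entropy` are hypothetical; `pi` is the eigenvalue-one eigenvector of the mean matrix of this particular toy law, in which each type has on average one child of the other type.

```python
import math

def tree_entropy(pi, Q, base=2.0):
    """<pi (x) Q, -log Q> = -sum_a pi(a) sum_c Q(c|a) log Q(c|a)."""
    total = 0.0
    for a, law in Q.items():
        for children, prob in law.items():
            if prob > 0:
                total -= pi[a] * prob * math.log(prob, base)
    return total

# Hypothetical critical two-type law: a vertex of type a has either no
# children or two children of the other type, each with probability 1/2.
Q = {
    0: {(): 0.5, (1, 1): 0.5},
    1: {(): 0.5, (0, 0): 0.5},
}
pi = {0: 0.5, 1: 0.5}

print(tree_entropy(pi, Q))  # entropy per vertex, in bits
```

In this toy case each vertex carries one fair binary choice, so under the stated assumptions a tree with $n$ vertices needs about $n$ bits.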

3.16 Derivation of Theorems 2.2 and 2.3

We recall that $\mathbb{P}_n$ is the law of a coloured random graph with $n$ vertices, and derive from our large deviation principles for coloured random graphs the following weak law of large numbers.

Lemma 3.19. Suppose that $X$ is a coloured random graph with colour law $\mu:\mathcal{X}\to[0,1]$ and connection probabilities $p_n:\mathcal{X}\times\mathcal{X}\to[0,1]$ such that $a_n^{-1}p_n(a,b)\to C(a,b)$ for some sequence $(a_n)$ with $a_n n\to 0$ or $a_n n\to 1$ or $a_n n\to\infty$ and $C:\mathcal{X}\times\mathcal{X}\to[0,\infty)$ nonzero. Then, for any $\varepsilon>0$ we have

$$\lim_{n\to\infty}\mathbb{P}_n\Big\{\sup_{a\in\mathcal{X}}|L^1(a)-\mu(a)|\ge\varepsilon\Big\} = 0$$

and

$$\lim_{n\to\infty}\mathbb{P}_n\Big\{\sup_{a,b\in\mathcal{X}}|L^2(a,b)-\mu(a)C(a,b)\mu(b)|\ge\varepsilon\Big\} = 0.$$

From Theorem 3.2, Theorem 3.3 and Theorem 3.4 we prove this lemma. To begin, we define a closed set

$$F_1 = \Big\{(\omega,\varpi)\in\mathcal{M}(\mathcal{X})\times\tilde{\mathcal{M}}_*(\mathcal{X}\times\mathcal{X}) : \sup_{a,b\in\mathcal{X}}|\varpi(a,b)-\mu(a)C(a,b)\mu(b)|\ge\varepsilon\Big\}.$$

We observe that in the sparse case (when $n a_n\to 1$), by Theorem 3.2,

$$\limsup_{n\to\infty}\tfrac{1}{n}\log\mathbb{P}_n\{(L^1,L^2)\in F_1\} \le -\inf_{(\omega,\varpi)\in F_1} I(\omega,\varpi). \tag{3.45}$$

We show by contradiction that the right-hand side of (3.45) is negative. For this purpose suppose that there exists a sequence $(\omega_n,\varpi_n)$ in $F_1$ such that $I(\omega_n,\varpi_n)\downarrow 0$. Then, because $I$ is a good rate function and its level sets are compact, and by lower semicontinuity of the mapping $(\omega,\varpi)\mapsto I(\omega,\varpi)$, there is a limit point $(\omega,\varpi)\in F_1$ with $I(\omega,\varpi)=0$. By Doku et al. [3, Lemma 3.4], we have $H(\omega\,\|\,\mu)=0$ and $\mathfrak{H}_C(\varpi\,\|\,\omega)=0$. This implies $\omega(a)=\mu(a)$ and $\varpi(a,b)=C(a,b)\omega(a)\omega(b)$, for $a,b\in\mathcal{X}$, which contradicts $(\omega,\varpi)\in F_1$. Hence the infimum is positive, as required.

For the subcritical case we can argue similarly with the LDP on the scale $a_n n^2$ with rate function $I_3$. The first statement of Lemma 3.19 follows similarly using the set

$$F_2 = \Big\{(\omega,\varpi)\in\mathcal{M}(\mathcal{X})\times\tilde{\mathcal{M}}_*(\mathcal{X}\times\mathcal{X}) : \sup_{a\in\mathcal{X}}|\omega(a)-\mu(a)|\ge\varepsilon\Big\}$$

and the LDP of Theorem 3.2 in the sparse case, and the LDP on the scale $n$ with rate function $I_4$ in the subcritical case. Finally, in the supercritical case, an analogous argument can be carried out using $F=F_1\cup F_2$ and the LDP on the scale $n$ with rate function $I_2$. ■
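The weak law in Lemma 3.19 can be observed in a small simulation. The Python sketch below samples one coloured random graph and returns its empirical colour measure and empirical pair measure. The normalisation of the pair measure by $a_n n^2$, with each edge counted in both orientations, is an assumption chosen so that the limit is $\mu(a)C(a,b)\mu(b)$; the parameters and the function name `sample_empirical_measures` are hypothetical.

```python
import random
from itertools import combinations

def sample_empirical_measures(n, mu, C, a_n, seed=0):
    """Sample a coloured random graph and return (L1, L2)."""
    rng = random.Random(seed)
    colours = list(mu)
    # Independent colours with law mu.
    x = rng.choices(colours, weights=[mu[c] for c in colours], k=n)
    L1 = {c: x.count(c) / n for c in colours}
    L2 = {(a, b): 0.0 for a in colours for b in colours}
    norm = a_n * n * n  # assumed normalisation of the pair measure
    for u, v in combinations(range(n), 2):
        # Connect u and v with probability p_n = a_n * C(colour u, colour v).
        if rng.random() < a_n * C[(x[u], x[v])]:
            L2[(x[u], x[v])] += 1.0 / norm
            L2[(x[v], x[u])] += 1.0 / norm
    return L1, L2

# Hypothetical parameters (a_n * C must stay in [0, 1]).
mu = {"r": 0.4, "b": 0.6}
C = {("r", "r"): 1.0, ("r", "b"): 2.0, ("b", "r"): 2.0, ("b", "b"): 0.5}

L1, L2 = sample_empirical_measures(300, mu, C, a_n=0.1, seed=1)
print(L1)
print(L2)
```

For larger $n$ the printed measures should move closer to $\mu$ and $\mu(a)C(a,b)\mu(b)$, in line with the lemma, although a single sample only illustrates the statement and does not verify it.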

We recall that $V$ is a fixed set of $n$ vertices, say $V=\{1,\dots,n\}$, $\mathcal{G}_n$ is the set of all (simple) graphs with vertex set $V=\{1,\dots,n\}$ and $E\subset\mathcal{E} := \{(u,v)\in V\times V : u<v\}$ the edge set.
We now compute the probability weight $P_n(x)$ of $x\in\mathcal{G}_n$ as

$$P_n(x) = \prod_{u\in V}\mu(x(u))\prod_{(u,v)\in E} p_n(x(u),x(v))\prod_{(u,v)\notin E}\big(1-p_n(x(u),x(v))\big)$$



-p„ (a;(u),a;(i;)) 



Therefore, we have in the case of Theorem [ 

; n ) 

2 \ ' a„ log n ' 2 \ ^ ' a„ n log n ' 



1 Inr^ p r-r^l — /T^ logM \ , 1 /r2 log(pn/(l-pn)) \ 

a„n^ logn -f^nl-^j — ) a„ n logn/ 2 W ' logn / 



logn' ^ 2 \ ' logn 

l/fl^rl log(l-pn)\ , 1 /rl log(l-Pn)\ 



In the case of Theorem 12.31 we have 



-ilogP„(^) = {L\-log^^) + i 



log n 

+ ^{L^^L\ -nlog(l - p„)) + i (Li, - log(l - pn)). 

Now in the first case the integrands „ , a-J^d ^^"^'''1^^!'-^^ all converge to zero, while 

"'"'^^^ogn"^"^^ ^ 1- f^^^c^ Theorem follows from Theorem [3l3 

In the second case both integrands $-\log(1-p_n)$ and $-n\log(1-p_n)$ converge to zero. Therefore, Theorem 2.3 follows from Lemma 3.19.
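The product formula for the probability weight $P_n(x)$ of a coloured graph can be evaluated directly on a small example. The Python sketch below computes $\log P_n(x)$ as the sum of the colour terms, the edge terms and the non-edge terms over pairs $u<v$. The three-vertex graph, the colour law and the connection probabilities are hypothetical, and `log_weight` is an illustrative name.

```python
import math
from itertools import combinations

def log_weight(colours, edges, mu, p):
    """log P_n(x) for a coloured graph x.

    colours[v] is the colour of vertex v, edges is a set of pairs (u, v)
    with u < v, mu is the colour law and p the connection probabilities.
    """
    total = sum(math.log(mu[c]) for c in colours)
    for u, v in combinations(range(len(colours)), 2):
        q = p[(colours[u], colours[v])]
        # Edges contribute log p_n, non-edges contribute log(1 - p_n).
        total += math.log(q) if (u, v) in edges else math.log(1.0 - q)
    return total

# Hypothetical example with three vertices and a single edge.
mu = {"r": 0.4, "b": 0.6}
p = {("r", "r"): 0.1, ("r", "b"): 0.5, ("b", "r"): 0.5, ("b", "b"): 0.2}
colours = ["r", "b", "b"]
edges = {(0, 1)}

lw = log_weight(colours, edges, mu, p)
print(lw, math.exp(lw))
```

Dividing `-lw` by the appropriate scale ($n$, or $a_n n^2\log n$ in the denser regime) gives the normalised code length whose limit the theorems describe.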
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